Big Data Company Datameer Makes Apache Hadoop a Desktop Tool

Apache Hadoop is a software framework that supports data-intensive distributed applications. It enables applications to work with thousands of computationally independent computers and petabytes of data. Hadoop's design was inspired by Google’s MapReduce and Google File System (GFS) papers.
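For readers unfamiliar with the MapReduce model mentioned above, the idea can be sketched in a few lines of Python. This is a minimal in-memory illustration and not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are assumptions made up for this example, and a real cluster would run the map and reduce steps on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine; here we just loop.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input, for illustration only.
counts = word_count(["big data big cluster", "data on a cluster"])
```

Because each map call looks at only one document and each reduce call at only one key, the work can be split across independent machines, which is what lets the model scale.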

Over the past couple of years, Hadoop has turned into the rock star of megascale data center software, with just about every Web company of consequence using it, including Yahoo, Amazon.com, Facebook and Twitter.

“Our goal is to really democratize data analytics by giving our users the tools they need to make data-driven decisions faster,” said Stefan Groschupf, CEO of Datameer. “By bypassing the traditional, slow, multi-step process of creating static schemas, we enable users to get right to analyzing and visualizing data without needing to rely on IT.”

Hadoop is a top-level Apache project, written in the Java programming language, that is built and used by a global community of contributors. It includes Hadoop Common, which provides access to the filesystems supported by Hadoop. The Hadoop Common package contains the JAR files and scripts needed to start Hadoop. The package also provides source code, documentation, and a contribution section that includes projects from the Hadoop community.

A small Hadoop cluster will include a single master and multiple worker nodes. The master node consists of a JobTracker, TaskTracker, NameNode, and DataNode. A slave or worker node acts as both a DataNode and TaskTracker, though it is possible to have data-only worker nodes, and compute-only worker nodes; these are normally only used in non-standard applications. Hadoop requires JRE 1.6 or higher. The standard startup and shutdown scripts require ssh to be set up between nodes in the cluster.
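The role layout described above can be written down as a small data structure. The following Python sketch is only an illustration of which daemons sit on which node in a small cluster; the hostnames (`master`, `worker1`, ...) and the helper names are hypothetical and do not correspond to any Hadoop configuration format.

```python
# Daemon roles as described in the text for a small cluster.
MASTER_ROLES = {"JobTracker", "TaskTracker", "NameNode", "DataNode"}
WORKER_ROLES = {"DataNode", "TaskTracker"}


def small_cluster(worker_count):
    """Return a hostname -> roles mapping for one master and N workers."""
    layout = {"master": set(MASTER_ROLES)}  # hypothetical hostname
    for i in range(1, worker_count + 1):
        layout[f"worker{i}"] = set(WORKER_ROLES)  # hypothetical hostnames
    return layout


def nodes_with(layout, role):
    """List the hosts that run the given daemon, in sorted order."""
    return sorted(host for host, roles in layout.items() if role in roles)


layout = small_cluster(3)
name_nodes = nodes_with(layout, "NameNode")
data_nodes = nodes_with(layout, "DataNode")
```

In this layout the NameNode and JobTracker run only on the master, while every node, the master included, stores data and runs tasks.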

In a larger cluster, HDFS is managed through a dedicated NameNode server that hosts the filesystem index, and a secondary NameNode that can generate snapshots of the NameNode’s memory structures, which helps prevent filesystem corruption and reduces loss of data. Similarly, a standalone JobTracker server can manage job scheduling. In clusters where the Hadoop MapReduce engine is deployed against an alternate filesystem, the NameNode, secondary NameNode and DataNode architecture of HDFS is replaced by the filesystem-specific equivalent.

Datameer, a company that allows users to analyze massive amounts of data without technical know-how, was founded in 2009 and is headquartered in San Mateo, California.

Datameer specializes in the analysis of large volumes of data for business users of Apache Hadoop. The company’s product, Datameer Analytics Solution (DAS), is a BI platform for Hadoop. It includes data source integration, an analytics engine with a spreadsheet interface and over 180 analytic functions designed for business users, and visualization in the form of reports, charts and dashboards. DAS is available for all major Hadoop distributions including Apache, Cloudera, EMC Greenplum HD, IBM BigInsights, MapR, Yahoo!, and Amazon.

The company created a user dashboard that makes it easy to feed data into Apache Hadoop and analyze it. Hadoop is open-source software that processes large data sets and produces analytics and reports. The tool mainly benefits users who lack the technical background to work with Apache Hadoop directly.

Datameer has launched a new release of its big data analytics solution, which combines data integration, analytics and visualization of any data type, size, or source in one easy-to-use application. For the first time, Datameer 2.0 brings Apache Hadoop directly to the desktop, with Hadoop natively embedded in two of the three new editions of the application. Datameer Personal runs on a single desktop, while Datameer Workgroup runs on a single server. Datameer Enterprise scales to thousands of nodes and runs on any Hadoop distribution.

“The need to analyze more data and increase the speed of analysis are the top two demands of big data technologies, which is what Datameer 2.0 is addressing with its newest release,” said Mark Smith, CEO and Chief Research Officer of Ventana Research. “Organizations have been lacking intuitive visualizations of big data from Hadoop, and Datameer’s new Business Infographics provide a major leap forward in revolutionizing analytics for business.”

Big Data Analytics on any device

With the introduction of the Business Infographics Designer, Datameer 2.0 breaks with the traditional box-after-box style of dashboards by giving users complete control over graphics and visualization design. The vector-based WYSIWYG designer enables users to choose exactly how they want to visualize their data, whether it's stunning infographics or more traditional reports, maps, graphs and dashboards.

“Nurago uses Datameer in large part because it makes us very efficient in getting the analytics we need to help drive our day-to-day business,” said Nikolaus Pohle, CTO at Nurago. “The new user interface in Datameer 2.0 takes ease-of-use to a whole new level and certain features like the new Finder and the Sheet Dependency Overlay have had a direct impact on how quickly and efficiently we can get to, and act on, our insights.”

Built on HTML5, Datameer 2.0 runs on any device, letting users work with and visualize their data on smartphones, tablets, desktops and laptops. Datameer supports all of the popular operating systems, including Windows, Mac OS and Linux, as well as VMware.

Datameer 2.0 also supports more data sources than ever, adding to an already extensive list and an easy-to-use API for custom integration. New sources in 2.0 include Twitter, Facebook, Netezza and COBOL, as well as Teradata export. Datameer also offers improved Hive integration, adding the ability to export to Hive alongside the previously supported Hive data import.

Story Highlights:
  • Datameer created a user dashboard that makes it easy to feed data into Apache Hadoop, open-source software that processes large data sets and produces analytics and reports, and to analyze it there.
  • The tool mainly benefits users who lack the technical background to work with Apache Hadoop directly.
  • On May 16, 2011, Datameer announced a $9.25M Series B financing round led by venture capital firm Kleiner Perkins Caufield & Byers.
