Parsing JSON

2015-10-20 Thread Papp, Stefan
Hi, I want to process JSON data. That is, I have to receive JSON data and prepare it for analytics. In the beginning, we might receive this data via files, but I assume we will soon switch to a streaming variant. What is currently the recommended best practice with Flink? Thank yo

RE: Hadoop ETLing with Flink

2015-04-20 Thread Papp, Stefan
display/Hive/HCatalog+InputOutput You probably have to do something like: HCatOutputFormat.setOutput(job, OutputJobInfo.create(dbName, outputTableName, null)); HCatSchema s = HCatOutputFormat.getTableSchema(job); HCatOutputFormat.setSchema(job, s); Let me know if you need more help writing to HCatalog. On Mon, Ap

Hadoop ETLing with Flink

2015-04-20 Thread Papp, Stefan
Hi, I want to load CSV files into a Hadoop cluster. How could I do that with Flink? I know I can load data into a CsvReader and then iterate over the rows and transform them. Is there an easy way to store the results in HDFS+HCatalog within Flink? Thank you! Stefan Papp Lead Hadoop Consultant

Quickstart build - Fat Jar is empty

2015-04-09 Thread Papp, Stefan
Hi, I am using * Apache Maven 3.3.1 * Java version: 1.8.0_40, vendor: Oracle Corporation (64 Bit) * OS name: "windows 7" I applied changes to the POM file: I changed the source and target version from 1.6 to 1.8 and uncommented the part with lambda expressions. I try t