Re: Large dataset on hbase

2016-04-13 Thread prabhu Mahendran
Hi, 1. Is the output of your Pig script a single file that contains all the JSON documents corresponding to your CSV? Yes, the output of my Pig script contains all the JSON documents corresponding to the CSV. 2. Also, are there any errors in logs/nifi-app.log (or on the processor in the UI) when this happens…
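
Since the Pig output is a single file holding every JSON document, and PutHBaseJSON works on one document per FlowFile, the combined output would need to be split before the put. A minimal flow sketch, assuming one JSON document per line of the Pig output:

    GetFile (pick up the Pig output file)
      -> SplitText (Line Split Count = 1)    one JSON document per FlowFile
      -> PutHBaseJSON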

Re: Large dataset on hbase

2016-04-12 Thread Bryan Bende
Is the output of your Pig script a single file that contains all the JSON documents corresponding to your CSV? Or does it create a single JSON document for each row of the CSV? Also, are there any errors in logs/nifi-app.log (or on the processor in the UI) when this happens? -Bryan

Re: Large dataset on hbase

2016-04-12 Thread prabhu Mahendran
Hi, I just use a Pig script to convert the CSV into JSON, with the help of ExecuteProcess. In my case I use n1 from the JSON document as the row key in the HBase table, so n2-n22 are stored as columns in HBase. Some of the rows (n1's) are stored in the table, but the remaining are read well but not…
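
A hedged sketch of how ExecuteProcess might be wired to run the conversion (the script name convert.pig and local mode are assumptions, not stated in the thread):

    ExecuteProcess
      Command:            pig
      Command Arguments:  -x local -f convert.pig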

Re: Large dataset on hbase

2016-04-12 Thread Bryan Bende
Hi Prabhu, How did you end up converting your CSV into JSON? PutHBaseJSON creates a single row from a JSON document. In your example above, using n1 as the rowId, it would create a row with columns n2-n22. Are you seeing columns missing, or are you missing whole rows from your original CSV? …
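
To make the mapping concrete, a hypothetical document and the row PutHBaseJSON would write from it (the values and the column family name cf are made up for illustration):

    { "n1": "row-1", "n2": "a", "n3": "b", ..., "n22": "v" }

    becomes row "row-1" with cells:
      cf:n2 = a, cf:n3 = b, ..., cf:n22 = v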

Re: Large dataset on hbase

2016-04-11 Thread prabhu Mahendran
Hi Simon/Joe, Thanks for your support. I have successfully converted the CSV data into JSON and also inserted that JSON data into an HBase table using PutHBaseJSON. Part of the JSON sample data looks like the below: { "n1":"", "n2":"", "n3":"", "n4":"", "n5":"", "n6":"", "n7":"", "n8":"", "n9":"", "n10":"", "n11":""…
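
A minimal sketch of the PutHBaseJSON configuration this implies (the table name, column family, and client service are assumptions, not given in the thread):

    PutHBaseJSON
      HBase Client Service:       HBase_1_1_2_ClientService
      Table Name:                 mytable
      Column Family:              cf
      Row Identifier Field Name:  n1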

Re: Large dataset on hbase

2016-04-09 Thread Joe Witt
Prabhu, If the dataset being processed can be split up and still retain the necessary meaning when input to HBase, I'd recommend doing that. NiFi itself, as a framework, can handle very large objects because its API doesn't force loading of entire objects into memory. However, various processors…
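
One common way to apply this in a flow is to split the large CSV in two stages, so that no single processor has to hold the whole file, or an enormous number of splits, in memory at once. A sketch, with arbitrary split sizes:

    GetFile (1 GB CSV)
      -> SplitText (Line Split Count = 10000)   coarse split
      -> SplitText (Line Split Count = 1)       one CSV row per FlowFile
      -> (CSV-to-JSON conversion)
      -> PutHBaseJSON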

Re: Large dataset on hbase

2016-04-09 Thread Simon Ball
Hi Prabhu, Did you try increasing the heap size in conf/bootstrap.conf? By default NiFi uses a very small RAM allocation (512 MB). You can increase this by tweaking java.arg.2 and .3 in the bootstrap.conf file. Note that this is the Java heap, so you will need more than your data size to account…
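
For reference, the lines in question in conf/bootstrap.conf ship with 512 MB defaults; raising them might look like the following (4 GB is only an example figure):

    # defaults
    java.arg.2=-Xms512m
    java.arg.3=-Xmx512m

    # increased, e.g.
    java.arg.2=-Xms4g
    java.arg.3=-Xmx4g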

Large dataset on hbase

2016-04-09 Thread prabhu Mahendran
Hi, I am new to NiFi and do not know how to process large data, like a one GB CSV, into HBase. Trying a combination of GetFile and PutHBase shell leads to a Java out-of-memory error, and trying a combination of ReplaceText, ExtractText and PutHBaseJSON doesn't work on a large dataset, but it works corr…