Re: Large dataset on hbase

2016-04-09 Thread Joe Witt
Prabhu,

If the dataset being processed can be split up and still retain the
necessary meaning when input to HBase, I'd recommend doing that.  NiFi
itself, as a framework, can handle very large objects because its API
doesn't force loading of entire objects into memory.  However, various
processors may do that, and I believe ReplaceText may be one that does.
You can use SplitText, ExecuteScript, or other processors to do that
splitting if that will help your case.
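
To illustrate the idea (a rough sketch only, not NiFi's actual implementation): SplitText breaks a large flowfile into many small ones, optionally carrying the header into each split so downstream processors like ExtractText and PutHBaseJSON only ever see a small chunk in memory. In Python terms:

```python
def split_csv(lines, chunk_size, keep_header=True):
    """Split CSV lines into chunks of at most chunk_size data rows,
    repeating the header line in each chunk (roughly what SplitText
    does with a header line count of 1)."""
    header, rows = (lines[:1], lines[1:]) if keep_header else ([], lines)
    return [header + rows[i:i + chunk_size]
            for i in range(0, len(rows), chunk_size)]
```

With splits of a few thousand rows each, no single flowfile ever approaches the 1 GB size that was blowing the heap.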

Thanks
Joe

On Sat, Apr 9, 2016 at 6:35 PM, Simon Ball  wrote:
> Hi Prabhu,
>
> Did you try increasing the heap size in conf/bootstrap.conf? By default NiFi
> uses a very small RAM allocation (512 MB). You can increase this by tweaking
> java.arg.2 and .3 in the bootstrap.conf file. Note that this is the Java
> heap, so you will need more than your data size to account for Java object
> overhead. The other thing to check is the buffer sizes you are using for
> your ReplaceText processors. If you're also using Split processors, you can
> sometimes run up against RAM and open-file limits; if this is the case, make
> sure you increase the ulimit -n setting.
>
> Simon
>
> On 9 Apr 2016, at 16:51, prabhu Mahendran  wrote:
>
> Hi,
>
> I am new to NiFi and do not know how to process large data, such as a one-GB
> CSV file, into HBase. Trying the combination of GetFile and PutHBase shell
> leads to a Java out-of-memory error, and the combination of ReplaceText,
> ExtractText, and PutHBaseJSON also doesn't work on the large dataset, though
> it works correctly on smaller datasets.
> Can anyone please help me to solve this?
> Thanks in advance.
>
> Thanks & Regards,
> Prabhu Mahendran
>
>


Re: Large dataset on hbase

2016-04-09 Thread Simon Ball
Hi Prabhu,

Did you try increasing the heap size in conf/bootstrap.conf? By default NiFi
uses a very small RAM allocation (512 MB). You can increase this by tweaking
java.arg.2 and .3 in the bootstrap.conf file. Note that this is the Java heap,
so you will need more than your data size to account for Java object overhead.
The other thing to check is the buffer sizes you are using for your ReplaceText
processors. If you're also using Split processors, you can sometimes run up
against RAM and open-file limits; if this is the case, make sure you increase
the ulimit -n setting.
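
For example (the values below are illustrative sizes, not recommendations; tune them to your data and available RAM):

```shell
# conf/bootstrap.conf -- raise the JVM heap from the 512 MB default.
# Before:
#   java.arg.2=-Xms512m
#   java.arg.3=-Xmx512m
# After (example values):
#   java.arg.2=-Xms2g
#   java.arg.3=-Xmx4g

# Raise the open-file limit for the NiFi user (example value)
# before (re)starting NiFi:
ulimit -n 50000

# Restart NiFi so the new settings take effect:
bin/nifi.sh restart
```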

Simon

On 9 Apr 2016, at 16:51, prabhu Mahendran <prabhuu161...@gmail.com> wrote:


Hi,

I am new to NiFi and do not know how to process large data, such as a one-GB
CSV file, into HBase. Trying the combination of GetFile and PutHBase shell
leads to a Java out-of-memory error, and the combination of ReplaceText,
ExtractText, and PutHBaseJSON also doesn't work on the large dataset, though
it works correctly on smaller datasets.
Can anyone please help me to solve this?
Thanks in advance.

Thanks & Regards,
Prabhu Mahendran



Large dataset on hbase

2016-04-09 Thread prabhu Mahendran
Hi,

I am new to NiFi and do not know how to process large data, such as a one-GB
CSV file, into HBase. Trying the combination of GetFile and PutHBase shell
leads to a Java out-of-memory error, and the combination of ReplaceText,
ExtractText, and PutHBaseJSON also doesn't work on the large dataset, though
it works correctly on smaller datasets.
Can anyone please help me to solve this?
Thanks in advance.

Thanks & Regards,
Prabhu Mahendran