I already mentioned that here: https://groups.google.com/forum/#!topic/nosql-databases/ZWyc4zDursg ... . I'm not sure if it is an issue. After setting the "batch size" everything worked nicely for me.
Anyway, that was another problem :) If there were a Filter, my current code would work fine with the HBase Java API.

John


2013/10/24 Jean-Marc Spaggiari <jean-m...@spaggiari.org>

> If the MR crashes because of the number of columns, then we have an issue
> that we need to fix ;) Please open a JIRA and provide details if you are
> facing that.
>
> Thanks,
>
> JM
>
>
> 2013/10/24 John <johnnyenglish...@gmail.com>
>
> > @Jean-Marc: Sure, I can do that, but that's a little bit complicated
> > because the rows sometimes have millions of columns and I have to handle
> > them in different batches, because otherwise HBase crashes. Maybe I will
> > try it later, but first I want to try the API version. It works okay so
> > far, but I want to improve it a little bit.
> >
> > @Ted: I tried to modify it, but I have no idea how exactly to do this. I
> > have to count the number of columns in that filter (that obviously works
> > with the count field). But there is no method that is called after
> > iterating over all elements, so I cannot return the Drop ReturnCode in
> > the filterKeyValue method because I don't know when it was the last one.
> > Any ideas?
> >
> > regards
> >
> >
> > 2013/10/24 Ted Yu <yuzhih...@gmail.com>
> >
> > > Please take a look at
> > > src/main/java/org/apache/hadoop/hbase/filter/ColumnCountGetFilter.java :
> > >
> > >  * Simple filter that returns first N columns on row only.
> > >
> > > You can modify the filter to suit your needs.
> > >
> > > Cheers
> > >
> > >
> > > On Thu, Oct 24, 2013 at 7:52 AM, John <johnnyenglish...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm currently writing an HBase Java program which iterates over every
> > > > row in a table. I have to modify some rows if the column size (the
> > > > number of columns in the row) is bigger than 25000.
> > > >
> > > > Here is my source code: http://pastebin.com/njqG6ry6
> > > >
> > > > Is there any way to add a Filter to the scan operation and load only
> > > > rows where the size is bigger than 25k?
> > > >
> > > > Currently I check the size at the client, but for that I have to load
> > > > every row to the client side. It would be better if the wrong rows
> > > > were already filtered on the "server" side.
> > > >
> > > > thanks
> > > >
> > > > John
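
On the question of knowing when the last column of a row has been seen: one possible shape is to only count in the per-cell callback and make the keep/drop decision in a separate per-row callback. The sketch below is plain Java and does not use any HBase classes; WideRowFilterSketch, onCell, shouldDropRow and keptRows are hypothetical names made up for illustration. They are meant to mirror the roles of filterKeyValue(), filterRow() and reset() in HBase's Filter interface, but whether those hooks behave as needed for very wide rows, or together with a scan batch size, is an assumption that would have to be checked against the HBase version in use.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of a "keep only wide rows" filter. Not an HBase class. */
public class WideRowFilterSketch {
    private final int minColumns; // threshold, e.g. 25000 as in the thread
    private int count = 0;        // cells seen so far in the current row

    public WideRowFilterSketch(int minColumns) {
        this.minColumns = minColumns;
    }

    /** Per-cell hook: never rejects a cell, only counts it. */
    public void onCell() {
        count++;
    }

    /** Per-row hook, called after the last cell: true means "drop this row". */
    public boolean shouldDropRow() {
        return count <= minColumns;
    }

    /** Called before each new row so that counts do not leak between rows. */
    public void reset() {
        count = 0;
    }

    /**
     * Drives the hooks in the assumed order (reset, cells, row decision)
     * over rows described only by their column counts, and returns the
     * indexes of the rows that would be kept.
     */
    public static List<Integer> keptRows(int[] columnsPerRow, int minColumns) {
        WideRowFilterSketch filter = new WideRowFilterSketch(minColumns);
        List<Integer> kept = new ArrayList<>();
        for (int row = 0; row < columnsPerRow.length; row++) {
            filter.reset();
            for (int c = 0; c < columnsPerRow[row]; c++) {
                filter.onCell();
            }
            if (!filter.shouldDropRow()) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Rows with 10, 30000, 25000 and 25001 columns; only rows 1 and 3 exceed 25000.
        System.out.println(keptRows(new int[] {10, 30000, 25000, 25001}, 25000)); // prints [1, 3]
    }
}
```

The point of the split is that the per-cell hook never needs to know which cell is the last one; the decision is deferred until the whole row has been counted.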