>> JDK 1.6.0_12 and Hadoop/HBase 0.19.1.
>>
>> On coding using some of the filter classes, Eclipse hangs, and I have no
>> option to continue unless I kill/restart the process. Has anybody figured
>> out the problem and found a fix?
>>
>> TIA,
>> Ramesh
Ramesh,

Interesting that you mention this. I have the same issue with the Scan
object. When I type

Scan scan = new Scan();
scan.addCo

and wait for the context help of Eclipse to open, it freezes on me. Other
classes are fine. I also wondered what the issue is and started to look
into it.
On Jun 11, 2009 11:44 AM, "Billy Pearson" wrote:

> That might be a good idea, but you might be able to redesign your layout
> using a different key than the current one; worth brainstorming.
>
> Billy
>
> "llpind" wrote in message news:23975432.p...@talk.nabble.com...
>
>> Sorry, I forgot to mention: the overflow then spills into new row keys,
>> one per 10,000 column entries ...
Also remember you might be able to convert to a tall table. Row keys can be
compound, and you can do partial left matches on them. E.g.:

Userid:timestamp:eventid

Now you have a tall table. Do prefix matches on the userid you want and you
get results in chronological order.

You can build equivalent indexes in HBase as in SQL. You may find a design
like this alleviates the need for extremely wide rows.

Good luck!
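To make the tall-table pattern concrete, here is a minimal, hypothetical
sketch of scanning one user's events with the 0.19-era client API used in
this thread; the table name, column family, and user id are made up for
illustration (org.apache.hadoop.hbase client classes assumed imported):

    // Compound row keys "userid:timestamp:eventid" sort lexicographically,
    // so a scan started at the user's prefix returns that user's events in
    // timestamp order.
    HTable table = new HTable(new HBaseConfiguration(), "events");
    byte[][] columns = { Bytes.toBytes("event:") };
    Scanner scanner = table.getScanner(columns, Bytes.toBytes("user42:"));
    try {
      for (RowResult row : scanner) {
        // Stop once the row key leaves this user's prefix.
        if (!Bytes.toString(row.getRow()).startsWith("user42:")) {
          break;
        }
        System.out.println(Bytes.toString(row.getRow()));
      }
    } finally {
      scanner.close();
    }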
> where a given RowResult can hold millions of columns to values. That's why
> Map/Reduce is having problems as well (Java heap exception). I've upped
> mapred.child.java.opts, but the problem persists.

See also HBASE-867: https://issues.apache.org/jira/browse/HBASE-867

St.Ack
On Wed, Jun 10, 2009 at 4:52 PM, llpind wrote:
>
> Thanks. I think the problem is I have potentially millions of columns,
> where a given RowResult can hold millions of columns to values. That's why
> Map/Reduce is having problems as well (Java heap exception). I've upped
> mapred.child.java.opts, but the problem persists.
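For reference, that knob is the standard Hadoop per-task heap setting; a
minimal sketch of raising it in a job driver (the driver class and the -Xmx
value are illustrative, not from this thread):

    // org.apache.hadoop.mapred API; raises the max heap of child tasks.
    JobConf conf = new JobConf(MyDriver.class);  // MyDriver is hypothetical
    conf.set("mapred.child.java.opts", "-Xmx1024m");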
>> llpind wrote:
>>>
>>> Thanks.
>>>
>>> Yea I've got that colFam for sure in the HBase table:
>>>
>>> {NAME => 'tableA', FAMILIES => [{NAME => 'colFam1', VERSIONS => '3',
>>> COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1',
>>> IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}
>>>
>>> I've been trying to play with rowcounter, and not having much luck
>>> either.
>>>
>>> I run the command:
>>>
>>> hadoop19/bin/hadoop org.apache.hadoop.hbase.mapred.Driver rowcounter
>>> /home/hadoop/dev/rowcounter7 tableA colFam1:
>>>
>>> The map/reduce finishes just like it does with my own program, but with
>>> all part files empty in /home/hadoop/dev/rowcounter7.
>>>
>>> Any Ideas?
>>>> ... ReducerClass.class, c );
>>>>
>>>> MapperClass:
>>>> ============
>>>>
>>>> @Override
>>>> public void map(ImmutableBytesWritable key, RowResult row,
>>>>     OutputCollector<ImmutableBytesWritable, RowResult> collector,
>>>>     Reporter reporter) throws IOException {
>>>>
>>>>   reporter.incrCounter(Counters.ROWS, 1);
>>>>   collector.collect(key, row);
>>>> }
public void reduce(ImmutableBytesWritable k, Iterator<RowResult> v,
    OutputCollector<ImmutableBytesWritable, BatchUpdate> c,
    Reporter r) throws IOException {

  while (v.hasNext()){
    BatchUpdate bu = new BatchUpdate(k.get());
    while (v.hasNext()){
      RowResult row = v.next();
      bu.put(Bytes.toBytes("count:rowToCountName"),
          Bytes.toBytes(row.size()));
    }
    c.collect(k, bu);
  }
}
==
It runs the map/reduce, but I get nothing in my output table.
Thanks.
llpind
Hi Lars,

Thanks for your suggestion... I will try this out today :)

Thanks once again,
Rakhi
Hi Rakhi,

The second part was meant to say: "...Setting it to *false* activates
the...", so call it like this:

final RowFilterInterface colFilter = new ColumnValueFilter(
    "Status:".getBytes(), ColumnValueFilter.CompareOp.EQUAL,
    "UNCOLLECTED".getBytes(), false);

Regards,
Lars

PS: And sorry ...
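A minimal, hypothetical sketch of this corrected filter in a client-side
scan; the table name is made up, and the 0.19 getScanner overload that takes
a RowFilterInterface is an assumption (client classes assumed imported):

    final RowFilterInterface colFilter = new ColumnValueFilter(
        "Status:".getBytes(), ColumnValueFilter.CompareOp.EQUAL,
        "UNCOLLECTED".getBytes(), false);

    HTable table = new HTable(new HBaseConfiguration(), "table1");
    Scanner scanner = table.getScanner(
        new byte[][] { Bytes.toBytes("Status:") },
        HConstants.EMPTY_START_ROW, colFilter);
    for (RowResult row : scanner) {
      // With filterIfColumnMissing == false, only rows whose Status: cell
      // equals UNCOLLECTED should arrive here.
    }
    scanner.close();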
Hi Rakhi,

Looking through the code of the ColumnValueFilter again, it seems it
does what you want when you add the extra "filterIfColumnMissing"
parameter to the constructor and set it to "false". The default "true"
does the column filtering and will return all rows that have that
column. Setting it to *true* activates the ...
Hi Lars,

Hmm... I had a look at other filters, but I thought ColumnValueFilter
would be more appropriate because in the constructor we could mention the
column name and the value. Probably I am going wrong there.

What I want is to filter out all the rows based on some column value.
Hi Rakhi,

Sorry, not yet. This is not an easy thing to replicate. I will try
though over the next few days if I find time. A few things to note
first, though. The way filters work is that they do *not* let filtered
rows through but actually filter them out. That means your logic seems
reversed.
Hi Lars,

Just wanted to follow up: did you try out the column value
filter? Did it work?

I really need it to improve the performance of my map-reduce programs.

Thanks a ton,
Raakhi
Hi Lars,

Well, the details are as follows:

table1 has the rowkey as some url, and 2 ColumnFamilies as described below:
one columnFamily called content, and
one columnFamily called status [which takes the values ANALYSED, UNANALYSED]
(all in upper case... I checked it, there is no issue with the spelling).
Hi Rakhi,

Wow, same here. I copied your RowFilter line, and when I press the dot
key and the fly-up opens, Eclipse hangs. Nice... NOT!

Apart from that, you are also saying that the filter is not working as
expected? Do you use any column qualifiers for the "Status:" column? Are
the values in t...
>>>>> Hi,
>>>>>
>>>>> I have a map reduce program with which I read from an HBase table.
>>>>> In my map program I check if the column value of a is xxx; if yes, I
>>>>> continue with processing, else I skip it.
>>>>> However, if my table is really big, most of my time in the map gets
>>>>> wasted processing unwanted rows.
>>>>> Is there any way through which we could send a subset of rows (based
>>>>> on the value of a particular column family) to the map?
>>>>>
>>>>> I have also gone through TableInputFormatBase but am not able to
>>>>> figure out how we set the input format if we are using the
>>>>> TableMapReduceUtil class to initialize table map jobs. Or is there
>>>>> any other way I could use it?
>>>>>
>>>>> Thanks in advance,
>>>>> Raakhi.
Hi,

I did try the filter... but using ColumnValueFilter. I declared a
ColumnValueFilter as follows:

public class TableInputFilter extends TableInputFormat
    implements JobConfigurable {

  public void configure(final JobConf jobConf) {
    setHtable(tablename);
    setInputColumns(...);
    setRowFilter(colFilter);
  }
}
Hi Rakhi,

The way the filters work is that you either use the supplied filters or
create your own subclasses - but then you will have to deploy that class
to all RegionServers while adding it to their respective hbase-env.sh
(in the "export HBASE_CLASSPATH" variable). We are discussing current ...
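A minimal sketch of the deployment step Lars describes; the jar path is
made up for illustration:

    # hbase-env.sh on every RegionServer: put the jar that contains the
    # custom filter class on the server classpath, then restart.
    export HBASE_CLASSPATH=/opt/hbase/lib/my-custom-filters.jar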
Maybe I don't understand, but if you have done the filter and extended
TableInputFormat, you can run a MR job with:

JobConf conf = new JobConf(...);
conf.setInputFormat(YourTableInputFormat.class);

Cheers,
Tim
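A fuller, hypothetical wiring of such a job with the 0.19-era mapred API,
using the TableInputFilter class from earlier in the thread; the driver
class, mapper class, and column list are assumptions:

    // TableInputFormat.COLUMN_LIST is the org.apache.hadoop.hbase.mapred
    // constant naming the columns to scan; NullOutputFormat comes from
    // org.apache.hadoop.mapred.lib.
    JobConf conf = new JobConf(MyDriver.class);       // MyDriver is made up
    conf.setJobName("filtered-table-scan");
    conf.setInputFormat(TableInputFilter.class);      // filtering input format
    conf.set(TableInputFormat.COLUMN_LIST, "Status: content:");
    conf.setMapperClass(MyTableMapper.class);         // hypothetical mapper
    conf.setNumReduceTasks(0);                        // map-only example
    conf.setOutputFormat(NullOutputFormat.class);     // discard map output
    JobClient.runJob(conf);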
Thanks Ryan, I will try that.

On Tue, Apr 7, 2009 at 3:05 PM, Ryan Rawson wrote:

> there is a server-side mechanism to filter rows, it's found in the
> org.apache.hadoop.hbase.filter package. im not sure how this interops with
> the TableInputFormat exactly.
>
> setting a filter to reduce the # of rows returned is pretty much exactly
> what you want.
Hi Tim,

I made a class which extends TableInputFormatBase and set the HTable,
InputColumns, and the row filter, but I don't know how to set that class
as an input to my map reduce program.

Currently I am using TableMapReduceUtil to set my tablename and
columnFamilies to set the input to my map.
I am a newbie, but...

I think it will boil down to something looking at the column and
applying the filter. I don't think you would get around this without
reworking the model or adding some kind of index.

Why not set a RowFilter on the TableInputFormat, and then it is
filtered before your map?
There is a server-side mechanism to filter rows; it's found in the
org.apache.hadoop.hbase.filter package. I'm not sure how this interops with
the TableInputFormat exactly.

Setting a filter to reduce the # of rows returned is pretty much exactly
what you want.

On Tue, Apr 7, 2009 at 2:26 AM, Rakhi Khatwani wrote:
Hi,

I have a map reduce program with which I read from an HBase table.
In my map program I check if the column value of a is xxx; if yes, I
continue with processing, else I skip it.
However, if my table is really big, most of my time in the map gets wasted
processing unwanted rows.
Is there any way through which we could send a subset of rows (based on
the value of a particular column family) to the map?