Sorry, I forgot to mention: the overflow then spills over into new row keys
every 10,000 column entries (or some other split number).
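For illustration, a minimal sketch of that bucketing scheme against the 0.19 client API (HTable/BatchUpdate); the table name, family, SPLIT_SIZE constant, and values are made up, not from this thread:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class BucketedWriter {
  // Overflow into a new physical row key every SPLIT_SIZE columns.
  private static final int SPLIT_SIZE = 10000;

  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "tableA");
    String logicalRow = "rowx";  // illustrative
    for (int i = 0; i < 1000000; i++) {
      String bucketRow = logicalRow + "_" + (i / SPLIT_SIZE);
      BatchUpdate bu = new BatchUpdate(bucketRow);
      // One commit per cell for simplicity; batch per bucket in practice.
      bu.put("colFam1:col" + i, new byte[0]);  // empty illustrative value
      table.commit(bu);
    }
  }
}

A scan started at "rowx_" then walks the buckets back in order, and no single RowResult ever holds more than SPLIT_SIZE cells.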
When is 0.20 planned for release? This particular issue is really
important to us.
Stack, I also have another question: the problem we are trying to solve
doesn't really need the extra layer present in the HBase (BigTable) structure
(RowResult holds a row key and a HashMap of column name to value).
cambridgemike wrote:
>
>
> -tried moving hbase-0.19.2.jar to the hadoop/lib folder of all the slave
> machines.
>
>
Hmm, that's weird; moving the HBase jars solved my issue. Go to the job
tracker UI, look at which machine is throwing the exception, and make
sure you have the HBase jars in hadoop/lib there.
monty,
2 things you can do:
1- Serialize the course data into the courses family in the student
table. You duplicate data, but disk is cheap, so that's OK now.
2- If all you need is to first show the course id and the course title
(or description), you can just put that as the value in the courses
family
On Wed, Jun 10, 2009 at 4:52 PM, llpind wrote:
>
> Thanks. I think the problem is I have potentially millions of columns.
>
> where a given RowResult can hold millions of column-to-value mappings.
> That's why Map/Reduce is having problems as well (Java heap exception).
> I've upped mapred.child.java.opts
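For what it's worth, a minimal sketch of option 2 above with the 0.19 client (HTable/BatchUpdate); the student table and courses family are from the thread, the keys and title are made up:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.util.Bytes;

public class DenormalizeCourse {
  public static void main(String[] args) throws Exception {
    HTable students = new HTable(new HBaseConfiguration(), "student");
    // Qualifier = course id, cell value = course title, so listing a
    // student's courses takes a single row read and no join.
    BatchUpdate bu = new BatchUpdate("student123");  // illustrative row key
    bu.put("courses:cs101", Bytes.toBytes("Intro to CS"));
    students.commit(bu);
  }
}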
Hi, I'm having almost the exact same problem (this rowcounter jar is one I
compiled myself)
./bin/hadoop jar rowcounter.jar org.myorg.RowCounter /user/myUser/output/
TABLE_NAME
09/06/10 19:44:19 INFO mapred.TableInputFormatBase: split:
0->domain:,18226263
09/06/10 19:44:19 INFO mapred.TableInput
Thanks. I think the problem is I have potentially millions of columns,
where a given RowResult can hold millions of column-to-value mappings. That's
why Map/Reduce is having problems as well (Java heap exception). I've upped
mapred.child.java.opts, but the problem persists.
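For reference, a hedged example of what upping the child heap looks like in hadoop-site.xml; the -Xmx512m value is illustrative (the stock default is -Xmx200m):

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>

Though if a single RowResult holds millions of cells, no realistic heap will save you; that's what splitting wide rows across bucketed row keys, as sketched at the top of the page, is for.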
Ryan Rawson wrote:
>
> Hey,
Hey,
A scanner's lease expires in 60 seconds. I'm not sure what version you are
using, but try:
table.setScannerCaching(1);
This way you won't retrieve 60 rows that each take 1-2 seconds to process.
This is the new default value in 0.20, but I don't know if it ended up in
0.19.x anywhere.
On
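To make that concrete, a minimal sketch of the slow-consumer scan, assuming your client has setScannerCaching as described above; the table and family names are illustrative:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scanner;
import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.hbase.util.Bytes;

public class SlowScan {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "tableA");
    // Fetch one row per RPC so slow per-row work can't outlive the
    // 60-second scanner lease on the regionserver.
    table.setScannerCaching(1);
    Scanner scanner = table.getScanner(new byte[][] { Bytes.toBytes("colFam1:") });
    try {
      RowResult row;
      while ((row = scanner.next()) != null) {
        // ... 1-2 seconds of work per row is now safe ...
      }
    } finally {
      scanner.close();
    }
  }
}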
Hey,
Looks like you have some HDFS issues.
Things I did to make myself stable:
- run HDFS with -Xmx2000m
- run HDFS with a 2047 xciever limit (goes into hdfs-site.xml or
hadoop-site.xml)
- ulimit -n 32k - also important
With this I find that HDFS is very stable; I've imported hundreds of gigs.
Y
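The xciever limit above is a datanode property; a hedged example of the stanza (note Hadoop's own misspelling, "xcievers"):

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>2047</value>
</property>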
Thanks so much for all the help, everyone... things are still broken,
but maybe we're getting close.
All the regionservers were dead by the time the job ended. I see
quite a few error messages like this:
(I've put the entirety of the regionserver logs on pastebin:)
http://pastebin.com/m2e6f9283
That is a client exception that is a sign of problems on the
regionserver... is it still running? What do the logs look like?
On Jun 10, 2009 2:51 PM, "Bradford Stephens"
wrote:
OK, I've tried all the optimizations you've suggested (still running
with a M/R job). Still having problems like this:
Also, there's a slight variation: "Trying to contact region server
Some server for region joinedcontent"
"Some server"? Interesting :)
On Wed, Jun 10, 2009 at 2:50 PM, Bradford
Stephens wrote:
> OK, I've tried all the optimizations you've suggested (still running
> with a M/R job). Still having p
OK, I've tried all the optimizations you've suggested (still running
with a M/R job). Still having problems like this:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
contact region server 192.168.18.15:60020 for region
joinedcontent,242FEB3ED9BE0D8EF3856E9C4251464C,12446665943
Okay, I think I got it figured out,
although when scanning large row keys I do get the following exception:
NativeException: java.lang.RuntimeException:
org.apache.hadoop.hbase.UnknownScannerException:
org.apache.hadoop.hbase.UnknownScannerException: -4424757523660246367
at
org.apache.ha
You might look into the API for these packages:
org.apache.hadoop.hbase.regionserver.tableindexed
org.apache.hadoop.hbase.client.tableindexed
http://hadoop.apache.org/hbase/docs/r0.19.3/api/index.html
I'm not sure about them, I've never used them, but I think they allow an
index on columns.
Billy
"Navee
Yes, that's what scanners are good for: they will return all the
column:label combos for a row.
What do the MR job stats say for rows processed by the maps and reduces?
Billy Pearson
"llpind" wrote in
message news:23967196.p...@talk.nabble.com...
also,
I think what we want is a way to
All the columns for any row key will be stored on one server, hosted by one
region; the regions are split by row key, not columns.
So all the columns for rowx will be only in one region on one server.
A table is made up of regions, 1 to start with; as more rows are added, the
regions split by row,
eac
also,
I think what we want is a way to wildcard everything after colFam1: (e.g.
colFam1:*). Is there a way to do this in HBase?
This is assuming we don't know the column names; we want them all.
llpind wrote:
>
> Thanks.
>
> Yea I've got that colFam for sure in the HBase table:
>
> {NAME =
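As far as I know there is no colFam1:* syntax, but passing the bare family prefix "colFam1:" to a scanner matches every column under it, known or not; a hedged 0.19-style sketch (table and family names from the thread):

import java.util.Map;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scanner;
import org.apache.hadoop.hbase.io.Cell;
import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.hbase.util.Bytes;

public class FamilyWildcard {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "tableA");
    // A family name with an empty qualifier selects the whole family.
    Scanner scanner = table.getScanner(new byte[][] { Bytes.toBytes("colFam1:") });
    RowResult row;
    while ((row = scanner.next()) != null) {
      // RowResult maps full column name -> Cell, so the qualifiers
      // need not be known in advance.
      for (Map.Entry<byte[], Cell> e : row.entrySet()) {
        System.out.println(Bytes.toString(e.getKey()) + " = "
            + Bytes.toString(e.getValue().getValue()));
      }
    }
    scanner.close();
  }
}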
rowcounter counts rows only. It does not produce any output.
St.Ack
On Wed, Jun 10, 2009 at 10:03 AM, llpind wrote:
>
> Thanks.
>
> Yea I've got that colFam for sure in the HBase table.
>
> I've been trying to play with rowcounter, and not having much luck either.
>
> I run the command:
> hadoo
Thanks.
Yea I've got that colFam for sure in the HBase table.
I've been trying to play with rowcounter, and not having much luck either.
I run the command:
hadoop19/bin/hadoop org.apache.hadoop.hbase.mapred.Driver rowcounter
/home/hadoop/dev/rowcounter7 tableA colFam1:
The map/reduce finishe
Billy,
By saying "columns for key1 will not be on all the nodes but just one node
in the cluster", you really mean "columns of the SAME family for key1...",
right?
Please correct me if I am wrong, but I think for the row key "key1", the
data value of "familyA:labelX" and that of "familyB:labelY"
That's correct - if you meant "it will have to scan EACH row in that column
family with at least one non-empty cell".
From http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture:
"Each column family in a region is managed by an *HStore*. Each HStore may
have one or more *MapFiles* (a Hadoop HDFS fi
Hi All
I define two tables having one to many relationship:
Student:
student id
student data (name, address, ...)
courses (use course ids as column qualifiers here)
Course:
course id
course data (name, syllabus, ...)
My problem is: using a Java client/program, how do I fetch cour
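A hedged sketch of the two-step fetch with the 0.19 client (getRow on Student, then one getRow per course id); table and family names follow the schema above, the row key is made up:

import java.util.Map;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.Cell;
import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.hbase.util.Bytes;

public class FetchCourses {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable students = new HTable(conf, "Student");
    HTable courses = new HTable(conf, "Course");
    // Step 1: read the student row; course ids are the qualifiers
    // under the courses: family.
    RowResult student = students.getRow("student123");  // illustrative key
    for (Map.Entry<byte[], Cell> e : student.entrySet()) {
      String column = Bytes.toString(e.getKey());
      if (!column.startsWith("courses:")) {
        continue;
      }
      String courseId = column.substring("courses:".length());
      // Step 2: fetch the course row by its id.
      RowResult course = courses.getRow(courseId);
      System.out.println(courseId + " -> " + course);
    }
  }
}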
On Tue, Jun 9, 2009 at 11:51 AM, Bradford Stephens <
bradfordsteph...@gmail.com> wrote:
> I sort of need the reduce since I'm combining primary keys from a CSV
> file. Although I guess I could just use the combiner class... hrm.
>
> How do I decrease the batch size?
Below is from hbase-default.
Have you seen this list: http://wiki.apache.org/hadoop/Hbase/PoweredBy?
These are the folks who volunteered to share the fact that they are using
hbase in production.
St.Ack
2009/6/9 Jürgen Kaatz
> Hi,
>
> can anybody tell me whether anyone uses HBase/Hadoop in a production environment?
> Any hints wou