[ 
https://issues.apache.org/jira/browse/HADOOP-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541612
 ] 

stack commented on HADOOP-2179:
-------------------------------

I couldn't get the data file from the cited link (nor from its seeming alias at 
dblp.l3s.uni-hannover.de). Traceroute never reaches the target from either of 
the two locations I tried, and wget attempts on port 80 hang forever. Perhaps 
it's temporarily down?  Odd.  I tried the dblp from here 
http://dblp.uni-trier.de/xml/ but it's in a different format from what your 
regex expects.  So I made a little script to dump out 100k lines that fit your 
regex pattern, modelled on an rdf triple you pasted into mail a while back.  
Here's a sample:

{code}
...
<http://dblp.l3s.de/d2r/resource/publications/books/acm/kim95/AnnevelinkACFHK9599998>
 <http://purl.org/dc/elements/1.1/creator> 
<http://dblp.l3s.de/d2r/resource/authors/Jurgen_Annevelink99998>.
<http://dblp.l3s.de/d2r/resource/publications/books/acm/kim95/AnnevelinkACFHK9599999>
 <http://purl.org/dc/elements/1.1/creator> 
<http://dblp.l3s.de/d2r/resource/authors/Jurgen_Annevelink99999>.
{code}
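
The script itself is not included here. A generator along the following lines 
would produce output like the sample above. It is only a minimal sketch under 
assumptions: the function names are hypothetical, each triple is assumed to sit 
on a single line (the sample above looks wrapped by the mailer), and the 
numbering is assumed to start at 0, since the sample ends at 99999.

```python
import sys

# URI prefixes copied from the sample above; a counter is appended to the
# subject and to the object so that every triple is unique.
SUBJECT = "http://dblp.l3s.de/d2r/resource/publications/books/acm/kim95/AnnevelinkACFHK95"
PREDICATE = "http://purl.org/dc/elements/1.1/creator"
OBJECT = "http://dblp.l3s.de/d2r/resource/authors/Jurgen_Annevelink"


def triple(i):
    """Return one triple line: <subject+i> <predicate> <object+i>."""
    return "<%s%d> <%s> <%s%d>." % (SUBJECT, i, PREDICATE, OBJECT, i)


def generate(count=100000):
    """Yield `count` triple lines, numbered 0 .. count-1 (assumed start)."""
    for i in range(count):
        yield triple(i)


def main(out=sys.stdout, count=100000):
    # Write one triple per line to the given stream.
    for line in generate(count):
        out.write(line + "\n")


if __name__ == "__main__":
    main()
```

Redirecting stdout to a file gives an input file for the mapreduce job.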

I set up a cluster made of mapreduce and hbase only.  I didn't bother with hdfs 
since we're running local.  My hadoop-site.xml had these two properties set:

{code}
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9000</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/tmp/mapred/system</value>
</property>
{code}

That's all the config that differs from the defaults.  I ran 
./bin/start-mapred.sh to start up the mapred cluster and then 
./src/contrib/hbase/bin/start-hbase.sh to start hbase.  (Were you running in 
pseudo-distributed mode?)

Map ran fine.  Reduce is currently stuck at the 84% mark -- it's loading the 
triples table in hbase.  I can see the entries going in by doing a query on the 
HQL page: 'select * from triples limit=10;' etc.  They are going in slowly, 
which is sort of what you'd expect when running a single reduce.

I'll let it run.  Will report back w/ whether it completes or OOMEs.

> [hbase] OOME running in 'local' mode
> ------------------------------------
>
>                 Key: HADOOP-2179
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2179
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Priority: Minor
>         Attachments: TriplesTest.java
>
>
> Holger Stenzhorn has been having issues running a mapreduce job that dumps 
> into a 'local' mode hbase.  Use this issue to figure whats going on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
