[
https://issues.apache.org/jira/browse/HADOOP-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541612
]
stack commented on HADOOP-2179:
-------------------------------
I couldn't get the data file from the cited link (nor from its apparent alias at
dblp.l3s.uni-hannover.de). Traceroute never reaches the target from either of
the two locations I tried, and wget attempts on port 80 hang forever. Perhaps
it's temporarily down? Odd. I tried the dblp dump from
http://dblp.uni-trier.de/xml/ but it's in a different format from what your regex
expects. So I made a little script to dump out 100k lines that fit your
regex pattern, modelled on an RDF triple you pasted into mail a while back.
Here's a sample:
{code}
...
<http://dblp.l3s.de/d2r/resource/publications/books/acm/kim95/AnnevelinkACFHK9599998>
<http://purl.org/dc/elements/1.1/creator>
<http://dblp.l3s.de/d2r/resource/authors/Jurgen_Annevelink99998>.
<http://dblp.l3s.de/d2r/resource/publications/books/acm/kim95/AnnevelinkACFHK9599999>
<http://purl.org/dc/elements/1.1/creator>
<http://dblp.l3s.de/d2r/resource/authors/Jurgen_Annevelink99999>.
{code}
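For reference, a generator along these lines can be sketched in a few lines of Python. This is not the script actually used, just a minimal sketch that reproduces the shape of the sample above. It assumes each triple sits on a single line and that the numeric suffix is appended to both the subject and the object; the names make_triple and generate are made up for illustration.

```python
import sys

# URI prefixes taken from the sample above; the trailing counter is appended
# to both subject and object (an assumption based on the sample lines).
SUBJECT = "http://dblp.l3s.de/d2r/resource/publications/books/acm/kim95/AnnevelinkACFHK95"
PREDICATE = "http://purl.org/dc/elements/1.1/creator"
OBJECT = "http://dblp.l3s.de/d2r/resource/authors/Jurgen_Annevelink"


def make_triple(i):
    """Return one triple as a single line (hypothetical helper)."""
    return f"<{SUBJECT}{i}> <{PREDICATE}> <{OBJECT}{i}>."


def generate(count=100_000):
    """Yield `count` distinct triples, numbered 0 .. count-1."""
    for i in range(count):
        yield make_triple(i)


def main(out=sys.stdout, count=100_000):
    # Write one triple per line so a line-oriented mapper can consume it.
    for line in generate(count):
        out.write(line + "\n")


if __name__ == "__main__":
    main()
```

Redirecting stdout to a file gives a 100k-line input for the job.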
I set up a cluster made of mapreduce and hbase only. I didn't bother with hdfs
since we're running local. My hadoop-site.xml had these two properties set:
{code}
<property>
<name>mapred.job.tracker</name>
<value>localhost:9000</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/tmp/mapred/system</value>
</property>
{code}
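For anyone wanting to reproduce the setup, the two overrides can be written out as a complete file. The following is a rough Python sketch, assuming the usual layout of a <configuration> root element wrapping one <property> per setting; build_site_xml is a hypothetical helper, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# The two non-default settings quoted above.
OVERRIDES = {
    "mapred.job.tracker": "localhost:9000",
    "mapred.system.dir": "/tmp/mapred/system",
}


def build_site_xml(overrides):
    """Return a hadoop-site.xml style document as a string (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_site_xml(OVERRIDES))
```

The output would go into conf/hadoop-site.xml; everything else stays at the defaults.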
That's all the config that differs from the defaults. I ran ./bin/start-mapred.sh
to start up the mapred cluster and then ./src/contrib/hbase/bin/start-hbase.sh
to start hbase. (Were you running in pseudo-distributed mode?)
The map ran fine. The reduce is currently stuck at the 84% mark; it's loading the
triples table in hbase. I can see the entries going in by running a query on the
HQL page: 'select * from triples limit=10;' etc. They are going in slowly, which
is sort of what you'd expect when running a single reduce.
I'll let it run. Will report back w/ whether it completes or OOMEs.
> [hbase] OOME running in 'local' mode
> ------------------------------------
>
> Key: HADOOP-2179
> URL: https://issues.apache.org/jira/browse/HADOOP-2179
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/hbase
> Reporter: stack
> Priority: Minor
> Attachments: TriplesTest.java
>
>
> Holger Stenzhorn has been having issues running a mapreduce job that dumps
> into a 'local' mode hbase. Use this issue to figure out what's going on.