How would the MapReduceIndexerTool (MRIT for short)
know which local disk to copy each shard's index to from HDFS?
All it has is the information in the Solr configs, and those are
usually paths on the local Solr machines, relative
to SOLR_HOME, which could be different on each node
(that would be screwy, but possible).
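
To see why the error in the quoted message looks the way it does, here is a
small sketch. It is only an illustration of the symptom, not Solr's actual
code path: it assumes the receiving node treats the HDFS URI as an ordinary
path relative to its local directory. The function name is made up; the two
paths are taken from the quoted log.

```python
import posixpath


def resolve_like_local_path(instance_dir: str, index_dir: str) -> str:
    """Treat index_dir as a plain local path relative to instance_dir.

    Assumption for illustration: no URI scheme handling at all, so
    "hdfs://host/..." is just a relative path with odd characters in it.
    """
    # join() appends the relative path; normpath() collapses the "//".
    return posixpath.normpath(posixpath.join(instance_dir, index_dir))


hdfs_dir = (
    "hdfs://bdvs086.test.com:9000/tmp/"
    "0000088-140618120223665-oozie-oozi-W/results/part-00001/data/index"
)
print(resolve_like_local_path("/opt/testdir/solr/node", hdfs_dir))
```

The printed path starts with /opt/testdir/solr/node/hdfs:/ (single slash after
"hdfs:"), which is the same shape as the directory named in the exception.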

Permissions would also be a royal pain to get right.

You _can_ forgo the --go-live option, copy the generated
indexes from HDFS to a local drive on each Solr node, and then
run the "mergeindexes" command, see:
https://cwiki.apache.org/confluence/display/solr/Merging+Indexes
Note that there is the IndexMergeTool, but there's also
the Core Admin command.
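
For the Core Admin route, a minimal sketch of building the request URL is
below. The action and parameter names (mergeindexes, core, indexDir) are the
Core Admin ones; the core name and the local directory are made-up
placeholders, and the base URL is just the one from the quoted log. The
indexDir must be a path that exists on the Solr node itself.

```python
from urllib.parse import urlencode


def merge_indexes_url(solr_base_url, core, index_dirs):
    """Build a Core Admin mergeindexes URL for one target core.

    index_dirs are local paths on the Solr node; indexDir may be
    repeated to merge several source indexes in one call.
    """
    params = [("action", "mergeindexes"), ("core", core)]
    params += [("indexDir", d) for d in index_dirs]
    return solr_base_url.rstrip("/") + "/admin/cores?" + urlencode(params)


# Hypothetical core name and local path, for illustration only.
url = merge_indexes_url(
    "http://bdvs087.test.com:8983/solr",
    "collection1_shard1_replica1",
    ["/data/mrit/part-00000/data/index"],
)
print(url)
```

You would issue that URL against each node hosting the target core, and a
commit on the target core is typically needed afterwards before the merged
documents show up.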

The sub-indexes are in sequentially numbered partition
directories in HDFS (part-00000, part-00001, ... as in the log below).
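
As a rough sketch of the copy step, the snippet below lists the numbered
sub-index directories and the "hdfs dfs -get" command for each one. The
part-NNNNN/data/index layout is taken from the paths in the quoted log; the
shard count and the local target directory are assumptions you would replace
with your own, and the commands are only printed, not run.

```python
def part_index_dirs(results_dir, num_shards):
    """Return the HDFS index directory for each numbered sub-index."""
    base = results_dir.rstrip("/")
    return [f"{base}/part-{i:05d}/data/index" for i in range(num_shards)]


def copy_commands(results_dir, num_shards, local_root):
    """Build one 'hdfs dfs -get' command per sub-index (not executed here)."""
    cmds = []
    for i, src in enumerate(part_index_dirs(results_dir, num_shards)):
        dst = f"{local_root.rstrip('/')}/part-{i:05d}"
        cmds.append(["hdfs", "dfs", "-get", src, dst])
    return cmds


results = "hdfs://bdvs086.test.com:9000/tmp/0000088-140618120223665-oozie-oozi-W/results"
# Two shards and /data/mrit are placeholders for illustration.
for cmd in copy_commands(results, 2, "/data/mrit"):
    print(" ".join(cmd))
```

Each copy has to land on the node that hosts the matching shard, which is
exactly the part MRIT has no way of working out for you.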

Best,
Erick

On Wed, Jul 2, 2014 at 3:23 PM, Tom Chen <tomchen1...@gmail.com> wrote:
> Hi,
>
>
> When we run the Solr MapReduceIndexerTool (
> https://github.com/markrmiller/solr-map-reduce-example), it generates
> indexes on HDFS.
>
> The last stage is Go Live, which merges the generated indexes into the
> live SolrCloud index.
>
> If the live SolrCloud writes its index to the local file system (rather
> than HDFS), Go Live fails with an error like this:
>
> 2014-07-02 13:41:01,518 INFO org.apache.solr.hadoop.GoLive: Live merge hdfs://bdvs086.test.com:9000/tmp/0000088-140618120223665-oozie-oozi-W/results/part-00000 into http://bdvs087.test.com:8983/solr
> 2014-07-02 13:41:01,796 ERROR org.apache.solr.hadoop.GoLive: Error sending live merge command
> java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: directory '/opt/testdir/solr/node/hdfs:/bdvs086.test.com:9000/tmp/0000088-140618120223665-oozie-oozi-W/results/part-00001/data/index' does not exist
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:233)
> at java.util.concurrent.FutureTask.get(FutureTask.java:94)
> at org.apache.solr.hadoop.GoLive.goLive(GoLive.java:126)
> at org.apache.solr.hadoop.MapReduceIndexerTool.run(MapReduceIndexerTool.java:867)
> at org.apache.solr.hadoop.MapReduceIndexerTool.run(MapReduceIndexerTool.java:609)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.solr.hadoop.MapReduceIndexerTool.main(MapReduceIndexerTool.java:596)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
> at java.lang.reflect.Method.invoke(Method.java:611)
> at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:491)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:434)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(AccessController.java:310)
> at javax.security.auth.Subject.doAs(Subject.java:573)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: directory '/opt/testdir/solr/node/hdfs:/bdvs086.test.com:9000/tmp/0000088-140618120223665-oozie-oozi-W/results/part-00001/data/index' does not exist
> at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:495)
> at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
> at org.apache.solr.client.solrj.request.CoreAdminRequest.process(CoreAdminRequest.java:493)
> at org.apache.solr.hadoop.GoLive$1.call(GoLive.java:100)
> at org.apache.solr.hadoop.GoLive$1.call(GoLive.java:89)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
> at java.util.concurrent.FutureTask.run(FutureTask.java:149)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:452)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
> at java.util.concurrent.FutureTask.run(FutureTask.java:149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
> at java.lang.Thread.run(Thread.java:738)
>
> Is there any way to set up SolrCloud to write its index to the local file
> system while still allowing the MapReduceIndexerTool's Go Live to merge
> indexes generated on HDFS into SolrCloud?
>
> Thanks,
> Tom
