Re: solr over hdfs for accessing/ changing indexes outside solr
Thank you very much. But why should we go for Solr distributed over Hadoop? There is already SolrCloud, which works well in the case of a big index. Is there any advantage to building indexes via MapReduce that SolrCloud cannot provide? Regards.
Re: solr over hdfs for accessing/ changing indexes outside solr
If SolrCloud meets your needs without Hadoop, then there's no real reason to introduce the added complexity. There are a bunch of problems that do _not_ work well with SolrCloud over non-Hadoop file systems; for those problems, the combination of SolrCloud and Hadoop makes tackling them possible. Best, Erick
Re: solr over hdfs for accessing/ changing indexes outside solr
Dear Erick, Could you please name those problems that SolrCloud cannot tackle alone? Maybe I need SolrCloud + Hadoop and am not aware of it yet. Regards.
Re: solr over hdfs for accessing/ changing indexes outside solr
bq: Are you aware of Cloudera search? I know they provide an integrated Hadoop ecosystem.

What Cloudera Search does via the MapReduceIndexerTool (MRIT) is create N sub-indexes for each shard in the M/R paradigm via EmbeddedSolrServer. Eventually, these sub-indexes for each shard are merged (perhaps through some number of levels) in the reduce phase, and may be merged into a live Solr instance (--go-live). You'll note that this tool requires the address of the ZooKeeper ensemble, from which it gets the network topology, configuration files, and all that. If you don't use the --go-live option, the output is still a Solr index; it's just that the index for each shard is left in a specific directory on HDFS. Being on HDFS allows this kind of M/R paradigm for massively parallel indexing operations, and perhaps massively complex analysis. Nowhere is there any low-level non-Solr manipulation of the indexes.

The Flume fork just writes directly to the Solr nodes. It knows about the ZooKeeper ensemble and the collection too, and communicates via SolrJ, I'm pretty sure. As far as integrating with HDFS, you're right, HA is part of the package.

As far as using the Solr indexes for analysis: you can write anything you want that uses the Solr indexes from anywhere in the M/R world and have them available from anywhere in the cluster. There's no real need to even have Solr running; you could use the output from MRIT and access the sub-shards with EmbeddedSolrServer if you wanted, leaving out all the pesky servlet-container stuff.

bq: So why we go for HDFS in the case of analysis if we want to use SolrJ for this purpose? What is the point?

Scale and data access, in a nutshell. In the HDFS world, you can scale pretty linearly with the number of nodes you can rack together. Frankly, though, if your data set is small enough to fit on a single machine _and_ you can get through your analysis in a reasonable time (what counts as reasonable is up to you), then HDFS is probably not worth the hassle. But in the big-data world, where we're talking petabyte scale, having HDFS as the underpinning opens up possibilities for working on data that were difficult or impossible with Solr previously.

Best, Erick
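For concreteness, an MRIT run is launched as a `hadoop jar` command. The flag names below match the tool's documented usage, but the jar name, ZooKeeper address, collection, and paths are placeholders that vary by cluster and Solr/CDH version, so treat this as a sketch rather than a copy-paste recipe:

```python
# Sketch: assemble a MapReduceIndexerTool (MRIT) invocation as an argument
# list. The jar name and all host/path values are placeholders.
def mrit_command(zk_host, collection, output_dir, morphline, go_live=False):
    cmd = [
        "hadoop", "jar", "solr-map-reduce.jar",  # jar name varies by distro
        "--zk-host", zk_host,            # ZK ensemble: topology + config files
        "--collection", collection,      # target SolrCloud collection
        "--output-dir", output_dir,      # HDFS dir for the per-shard sub-indexes
        "--morphline-file", morphline,   # ETL config mapping input records to fields
    ]
    if go_live:
        cmd.append("--go-live")          # merge the results into the live cluster
    return cmd

print(" ".join(mrit_command("zk1:2181/solr", "collection1",
                            "hdfs://nn:8020/tmp/outdir", "morphline.conf",
                            go_live=True)))
```

Without `--go-live`, the per-shard indexes are simply left under the output directory on HDFS, as Erick describes.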
solr over hdfs for accessing/ changing indexes outside solr
Dear all, Hi, I changed Solr 4.9 to write its index and data on HDFS. Now I want to connect to that data from outside Solr to change some of the values. Could somebody please tell me how that is possible? Suppose I am using HBase over HDFS to make these changes. Best regards. -- A.Nazemian
Re: solr over hdfs for accessing/ changing indexes outside solr
On 8/5/2014 7:04 AM, Ali Nazemian wrote: "Now I am going to connect to those data from the outside of solr for changing some of the values." I don't know how you could safely modify the index without a Lucene application or another instance of Solr, but if you do manage to modify the index, simply reloading the core or restarting Solr should cause it to pick up the changes. Either you would need to make sure that Solr never modifies the index, or you would need some way of coordinating updates so that Solr and the other application never try to modify the index at the same time. Thanks, Shawn
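The reload Shawn mentions can be triggered through Solr's CoreAdmin API. A minimal sketch of building that request, where the host and core name are placeholders for your setup:

```python
# Build the CoreAdmin RELOAD request used to make Solr pick up an index
# that was modified out-of-band. Standard library only; host and core
# name are hypothetical.
from urllib.parse import urlencode

def reload_core_url(solr_base, core):
    # CoreAdmin endpoint: /admin/cores?action=RELOAD&core=<name>
    return f"{solr_base}/admin/cores?{urlencode({'action': 'RELOAD', 'core': core})}"

url = reload_core_url("http://localhost:8983/solr", "collection1")
print(url)
# An actual reload would issue an HTTP GET against this URL,
# e.g. urllib.request.urlopen(url).
```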
Re: solr over hdfs for accessing/ changing indexes outside solr
Probably the most correct way to modify the index would be to use the Solr REST API to push your changes out. Another thing you might want to look at is Lily. Basically it's a way to set up a Solr collection as an HBase replication target, so changes to your HBase table automatically propagate over to Solr. http://www.ngdata.com/on-lily-hbase-hadoop-and-solr/ Michael Della Bitta, Applications Developer, appinions inc.
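Pushing a field change through Solr's update endpoint uses the JSON atomic-update format, where a modifier object like {"set": value} replaces just that field. A sketch with hypothetical document ID and field names:

```python
# Sketch: build a Solr atomic-update payload that sets one field on one
# document. The doc id and field name below are illustrative.
import json

def atomic_update(doc_id, field, value):
    # Atomic update format: [{"id": ..., "<field>": {"set": <value>}}]
    # "set" replaces the stored value; "add" and "inc" are other modifiers.
    return [{"id": doc_id, field: {"set": value}}]

payload = json.dumps(atomic_update("doc42", "category", "analyzed"))
print(payload)
# POST this to http://<host>:8983/solr/<collection>/update?commit=true
# with Content-Type: application/json (e.g. via curl or urllib).
```

Note that atomic updates require the other fields of the document to be stored and an updateLog to be configured, since Solr rebuilds the document internally when applying the change.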
Re: solr over hdfs for accessing/ changing indexes outside solr
Actually I am going to do some analysis on the Solr data using MapReduce. For this purpose it might be necessary to change some parts of the data or add new fields from outside Solr. -- A.Nazemian
Re: solr over hdfs for accessing/ changing indexes outside solr
What you haven't told us is what you mean by modifying the index outside Solr. SolrJ? Raw Lucene? Trying to modify things by writing your own codec? Standard Java I/O operations? Something else? You could use SolrJ to connect to an existing Solr server and both read and modify at will from your M/R jobs. But if you're thinking of trying to write or modify the segment files with raw I/O operations, good luck! I'm 99.99% certain that's going to cause you endless grief. Best, Erick
Re: solr over hdfs for accessing/ changing indexes outside solr
Dear Erick, Hi, Thank you for your reply. Yeah, I am aware that SolrJ is my last option; I was thinking about raw I/O operations, and according to your reply that is probably not feasible. What about the Lily project that Michael mentioned? Does that use SolrJ too? Are you aware of Cloudera Search? I know they provide an integrated Hadoop ecosystem. Do you know what their suggestion is? Best regards. -- A.Nazemian
Re: solr over hdfs for accessing/ changing indexes outside solr
Dear Erick, I remember that some time ago somebody asked what the point of modifying Solr to use HDFS for storing indexes is. As far as I remember, he was told that integrating Solr with HDFS has two advantages: 1) getting Hadoop replication and HA, and 2) using the indexes and Solr documents for other purposes, such as analysis. So why would we go for HDFS in the case of analysis if we have to use SolrJ for that purpose? What is the point? Regards. -- A.Nazemian