Re: phrase extraction from user paragraph input

2014-11-28 Thread Vineet Mishra
Hi Nikos,

Could you give an example of your use case? I think that would help in
understanding the problem more clearly.

Cheers!

On Fri, Nov 28, 2014 at 2:31 PM, Nikos Chaliasos nchal...@cs.uoi.gr wrote:

 Hello,

 I am investigating a university project where in a part of it, the user
 would give a paragraph of text as input and the parsing process (after
 removing stopwords) would extract a series of descriptive topics about the
 paragraph, with which I could then search in documents for results.
 Is there any available bibliography/source that could help me start with? I
 am very new to solr/lucene and I couldn't find anything similar to what I
 am thinking.

 Thank you,

 Nikos Chaliasos



Re: Inconsistent Behavior of Solr Cloud

2014-06-16 Thread Vineet Mishra
Hi Erick,

Thanks for your response. I got it resolved. I think the indexes were not
properly distributed, and I also saw some uneven behavior while indexing.
To elaborate:

I had three shards in my collection. I started indexing with
EmbeddedSolrServer and indexed around 50 million documents (15 GB index
size without replication). After that I indexed another 50 million into a
different directory for the next shard, but when I checked the indexing
stats the next day (after it had been running for 15 hours or so) it was
still running and the index had grown to 60 GB. I didn't understand why so
much disk space had been used for the same amount of data, 50 million
documents, that I had indexed previously. Eventually I stopped the process,
as it was not making progress, and copied the indexes to the next shard.

#When I queried later with *:* I got 69 million documents in the response
(it was supposed to be 100 million).

##I am not sure where the other 30 million went, but the problem started
once I indexed the remaining 30 million, which were missing from the query
at #, to the next shard.

I have read somewhere that the consistency of the cloud is broken if
different shards hold documents with the same value in the unique ID field.
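
A rough way to test that suspicion, sketched in Java under stated
assumptions: given the set of unique ID values held by each shard (how they
are exported is left open, and the shard names and IDs below are made up),
report every ID that lives on more than one shard. This is only an
illustrative check written against plain collections, not a Solr API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class DuplicateIdCheck {
    /** Returns each ID found on more than one shard, mapped to the shards that hold it. */
    static Map<String, List<String>> findDuplicates(Map<String, Set<String>> idsByShard) {
        Map<String, List<String>> shardsById = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : idsByShard.entrySet()) {
            for (String id : entry.getValue()) {
                shardsById.computeIfAbsent(id, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        // Keep only the IDs that are held by two or more shards.
        shardsById.values().removeIf(shards -> shards.size() < 2);
        return shardsById;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: ID "3" was indexed into two shards.
        Map<String, Set<String>> idsByShard = new LinkedHashMap<>();
        idsByShard.put("shard1", new HashSet<>(Arrays.asList("1", "2", "3")));
        idsByShard.put("shard2", new HashSet<>(Arrays.asList("3", "4")));
        idsByShard.put("shard3", new HashSet<>(Arrays.asList("5")));
        System.out.println(findDuplicates(idsByShard));
    }
}
```

An empty result would suggest the shards do not overlap, at least for the
IDs that were exported.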

With this, I have a few things to clarify.
*Was the inconsistent behavior caused by the step I took at ##?
*If the inconsistency was caused by ##, then why were all 100 million
documents not present after #?
*When the same amount of data was previously indexed in just 15 GB, why did
the index for the next 50 million grow to 60 GB?
*What approach should be taken to index huge amounts of data into SolrCloud
in reasonable time, if EmbeddedSolrServer is not a good choice?

Looking forward to your response.

Thanks!


On Sat, Jun 14, 2014 at 12:31 AM, Erick Erickson erickerick...@gmail.com
wrote:

 It seems like for some reason you have shards that are not reachable.
 What does your cloud stat in the admin UI tell you when you don't get
 all the docs back?

 Best,
 Erick

 On Fri, Jun 13, 2014 at 1:37 AM, Vineet Mishra clearmido...@gmail.com
 wrote:
  Hi All,
 
  I am having a Cloud setup with 3 Shards and 2 Replica running on 3
 Tomcats
  with 3 External Zookeeper, all running on single machine.
  I have Indexed around 70 Mln Documents that seems to be querying back
 fine.
  When I index another 30 Mln to same, the result are vague as with the
 query
  *:* its sometimes returning 2 Shards result and sometime all the shards
  result.
  So to make it clear if I query with *:* to the 100Mln index its should
  return back 100Mln docs, but sometimes its returning 70Mln and sometimes
  100Mln(Actual Result) with the same query.
 
  This is just not case with the *:* query but even if I query with the id
 
  q=id:123
 
  its sometimes coming with the result and sometimes not.
 
  Looking for possible solution.
 
  Thanks!



Inconsistent Behavior of Solr Cloud

2014-06-13 Thread Vineet Mishra
Hi All,

I have a SolrCloud setup with 3 shards and 2 replicas running on 3 Tomcats
with 3 external ZooKeepers, all on a single machine.
I have indexed around 70 million documents, which query back fine.
When I index another 30 million into the same collection, the results are
inconsistent: the query *:* sometimes returns results from only 2 shards and
sometimes from all the shards.
To make it clear, if I query the 100-million-document index with *:* it
should return 100 million docs, but with the same query it sometimes returns
70 million and sometimes 100 million (the actual result).

This is not only the case with the *:* query; the same happens when I query
by id:

q=id:123

Sometimes it returns the result and sometimes it does not.
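
One way to narrow this down is to ask each core for its own count instead of
the distributed total. The Java sketch below only builds the request URLs.
`distrib=false` and `rows=0` are standard Solr request parameters, but the
ports and core names are placeholders and not taken from this setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

final class PerCoreCountUrls {
    /** Builds a count-only query that is answered by the addressed core alone. */
    static String countUrl(String coreBaseUrl, String query) {
        try {
            return coreBaseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&rows=0&distrib=false";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder core URLs; substitute the cores shown in your own cloud view.
        List<String> cores = Arrays.asList(
                "http://localhost:8080/solr/mycollection_shard1_replica1",
                "http://localhost:8081/solr/mycollection_shard2_replica1",
                "http://localhost:8082/solr/mycollection_shard3_replica1");
        for (String core : cores) {
            System.out.println(countUrl(core, "*:*"));
        }
    }
}
```

Comparing the numFound of each core against the distributed total should
show whether one shard is intermittently missing from the merged result.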

Looking for a possible solution.

Thanks!


Re: Collection communication internally

2014-06-10 Thread Vineet Mishra
Then is there some other alternative for achieving the goal? Querying with
a set of foreign ids this way is going to make the query very large, and
the response also takes a long time to come back (previously tested with a
standalone Solr core in a master-slave architecture).

Thanks!




On Mon, Jun 9, 2014 at 8:42 PM, Erick Erickson erickerick...@gmail.com
wrote:

 My first answer is don't do it that way :).

 Solr works best with flattened (de-normalized) data. If at all
 possible, you _really_ would be better off combining the two
 collections and flattening the data even though there would be more
 data.

 Whenever I see a question like this, I wonder if you're trying to use
 Solr like a DB, in this case with collections substituting for
 tables, and this is almost always a mistake.

 If you really must do this, consider cross-core joins if at all
 possible, but I don't think this is supported yet for distributed
 setups.

 Best,
 Erick

 On Mon, Jun 9, 2014 at 7:32 AM, Vineet Mishra clearmido...@gmail.com
 wrote:
  Hi All,
 
  I was curious to know how multiple Collection communication be achieved?
 If
  yes then by what means.
 
  The use case says, having multiple collection I need to query the first
  collection and get the unique ids from first collection to query the
 second
  one(Foreign Key Relation). Now if the no. of terms to be passed to second
  collection is relatively small then its fine otherwise the problem arise,
  as adding them to the query is little time consuming in sense of building
  the query, querying to solr and waiting for the result to respond back.
 
  So the query would look something like -
 
  http://localhost:7070/solr/recollection/select?q=*:*&fl=id&sort=id_S%20desc
  http://localhost:7070/solr/mycollection/select?q=ID:(1 OR 2 OR ... OR 10)&fl=*
 
  So for the above form of query where the query terms are expanding
  vigorously I was looking out for some solution where the collections can
  internally resolve the query and fetch the resultant output.
 
  Thanks!



Re: solr4 optimization

2014-06-10 Thread Vineet Mishra
As Otis mentioned, it is good to run an optimize once in a while, or when
you are done with most of your heavy indexing. The concern is not disk
capacity but rather I/O and seeking across segments: with fewer segments to
query there is less I/O, and so your query response is quicker.

Give it a go and share the stats.

Cheers!


On Tue, Jun 10, 2014 at 1:54 AM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 Hi,

 I don't remember last time I ran optimize.  Sure, yes, things will work
 faster if you optimize an index and reduce the number of segments, but if
 you are regularly writing to that index and performance is OK, leave it to
 Lucene segment merges to purge deletes.

 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/


 On Mon, Jun 9, 2014 at 4:15 PM, Joshi, Shital shital.jo...@gs.com wrote:

   Hi,
 
  We have SolrCloud cluster (5 shards and 2 replicas) on 10 boxes. On some
  of the boxes we have about 5 million deleted docs and we have never run
  optimization since beginning. Does number of deleted docs have anything
 to
  do with performance of query? Should we consider optimization at all if
  we're not worried about disk space?
 
  Thanks!
 
 
 



Collection communication internally

2014-06-09 Thread Vineet Mishra
Hi All,

I was curious to know whether communication between multiple collections can
be achieved, and if so, by what means.

The use case is this: with multiple collections, I need to query the first
collection and use the unique ids it returns to query the second one (a
foreign-key relation). If the number of terms to be passed to the second
collection is relatively small then it is fine; otherwise a problem arises,
as adding them to the query is time consuming in terms of building the
query, sending it to Solr and waiting for the response.

So the query would look something like -

http://localhost:7070/solr/recollection/select?q=*:*&fl=id&sort=id_S%20desc
http://localhost:7070/solr/mycollection/select?q=ID:(1 OR 2 OR ... OR 10)&fl=*

So for this form of query, where the number of query terms grows rapidly, I
was looking for a solution where the collections can resolve the query
internally and return the resulting output.
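
For reference, the client-side approach described above can be sketched in
Java roughly as follows. It builds the ID:(... OR ...) clause from the ids
returned by the first collection and splits them into batches so that no
single query grows without bound. The batch size is an arbitrary choice for
illustration, and the ids are assumed to be plain tokens that need no
escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ForeignKeyQuery {
    /** Builds a clause such as ID:(1 OR 2 OR 3) for one batch of ids. */
    static String idClause(String field, List<String> ids) {
        return field + ":(" + String.join(" OR ", ids) + ")";
    }

    /** Splits the ids into batches and returns one clause per batch. */
    static List<String> batchedClauses(String field, List<String> ids, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<String> clauses = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            clauses.add(idClause(field, ids.subList(start, end)));
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Ids as they might come back from the first collection (made-up values).
        List<String> ids = Arrays.asList("1", "2", "3", "4", "5", "6", "7");
        for (String clause : batchedClauses("ID", ids, 3)) {
            System.out.println(clause);
        }
    }
}
```

Each clause then becomes the q parameter of one request to the second
collection. Batching keeps every request below Solr's maxBooleanClauses
limit (1024 by default), but it does not remove the round trips, which is
why I am asking for a server-side alternative.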

Thanks!


Re: Solr maximum Optimal Index Size per Shard

2014-06-06 Thread Vineet Mishra
Hi Shawn,

Thanks for your response. I wanted to clarify a few things.

*Does that mean that for smooth querying we need memory at least equal to
the size of the index? In my case the index will be very large (~2TB), and
practically speaking that amount of memory is not possible. Even if it is
split across multiple shards, say around 10, 200GB of RAM per shard is
still not a feasible option.

*With CloudSolrServer, can we specify which shard a particular document
should go to and reside on? I can do this with EmbeddedSolrServer by
indexing into different directories and moving them to the appropriate
shard directories.
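
To make the memory question concrete, here is the arithmetic as a small Java
sketch. It applies the sizing rule quoted below (heap requirements plus the
size of the index data on the machine) and the half-of-ideal rule of thumb.
The 8GB heap figure and the one-shard-per-machine layout are assumptions
picked only for illustration.

```java
final class RamEstimate {
    /** Ideal minimum RAM per the quoted rule: heap requirements plus on-disk index size. */
    static double idealGb(double indexGbOnMachine, double heapGb) {
        return indexGbOnMachine + heapGb;
    }

    /** Rule-of-thumb floor: below half of the ideal, performance problems become likely. */
    static double ruleOfThumbFloorGb(double idealGb) {
        return idealGb / 2.0;
    }

    public static void main(String[] args) {
        double totalIndexGb = 2000.0; // ~2TB, as stated above
        int shards = 10;              // one shard per machine (assumption)
        double heapGb = 8.0;          // assumed heap for Solr and other programs, illustrative only

        double perMachineIndexGb = totalIndexGb / shards;
        double ideal = idealGb(perMachineIndexGb, heapGb);
        System.out.printf("index per machine: %.0f GB%n", perMachineIndexGb);
        System.out.printf("ideal RAM:         %.0f GB%n", ideal);
        System.out.printf("rule-of-thumb min: %.0f GB%n", ruleOfThumbFloorGb(ideal));
    }
}
```

Even the rule-of-thumb figure comes out at roughly 100GB per machine with
these numbers, which is the reason for my concern.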

Thanks!



On Wed, Jun 4, 2014 at 12:43 PM, Shawn Heisey s...@elyograg.org wrote:

 On 6/4/2014 12:45 AM, Vineet Mishra wrote:
  Thanks all for your response.
  I presume this conversation concludes that indexing around 1Billion
  documents per shard won't be a problem, as I have 10 Billion docs to
 index,
  so approx 10 shards with 1 Billion each should be fine with it and how
  about Memory, what size of RAM should be fine for this amount of data?

 Figure out the heap requirements of the operating system and every
 program on the machine (Solr especially).  Then you would add that
 number to the total size of the index data on the machine.  That is the
 ideal minimum RAM.

 http://wiki.apache.org/solr/SolrPerformanceProblems

 Unfortunately, if you are dealing with a huge index with billions of
 documents, it is likely to be prohibitively expensive to buy that much
 RAM.  If you are running Solr on Amazon's cloud, the cost for that much
 RAM would be astronomical.

 Exactly how much RAM would actually be required is very difficult to
 predict.  If you had only 25% of the ideal, your index might have
 perfectly acceptable performance, or it might not.  It might do fine
 under a light query load, but if you increase to 50 queries per second,
 performance may drop significantly ... or it might be good.  It's
 generally not possible to know how your hardware will perform until you
 actually build and use your index.


 http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

 A general rule of thumb for RAM that I have found to be useful is that
 if you've got less than half of the ideal memory size, you might have
 performance problems.

  Moreover what should be the indexing technique for this huge data set, as
  currently I am indexing with EmbeddedSolrServer but its going
 pathetically
  slow after some 20Gb of indexing. Comparatively SolrHttpPost was slow due
  to network delays and response but after this long running the indexing
  with EmbeddedSolrServer I am getting a different notion.
  Any good indexing technique for this huge dataset would be highly
  appreciated.

 EmbeddedSolrServer is not recommended.  Run Solr in the traditional way
 with HTTP connectivity.  HTTP overhead on a LAN is usually quite small.
  Solr is fully thread-safe, so you can have several indexing threads all
 going at the same time.

 Indexes at this scale should normally be built with SolrCloud, with
 enough servers so that each machine is only handling one shard replica.
  The ideal indexing program would be written in Java, using
 CloudSolrServer.

 Thanks,
 Shawn




Re: Solr maximum Optimal Index Size per Shard

2014-06-06 Thread Vineet Mishra
Hey Jack,

Well, I have indexed around 10 million documents, which take up 20 GB of
index.
Each document consists of nearly 100 string fields with up to 10
characters of data per field.
In my case the number of fields per document can grow considerably (from
the current 100 to 500 or even more).

As for the typical exceptional case, I was more interested in a way to
maintain the right ratio of index size to shards.

Thanks!


On Wed, Jun 4, 2014 at 7:47 PM, Jack Krupansky j...@basetechnology.com
wrote:

 How many documents were in that 20GB index?

 I'm skeptical that a 1 billion document shard won't be a problem. I mean
 technically it is possible, but as you are already experiencing, it may
 take a long time and a very powerful machine to do so. 100 million (or 250
 million max) would be a more realistic goal. Even then, it depends on your
 doc size and machine size.

 The main point from the previous discussion is that although the technical
 hard limit for a Solr shard is 2G docs, from a practical perspective it is
 very difficult to get to that limit, not that indexing 1 billion docs on a
 single shard is just fine!

 As a general rule, if you want fast queries for high volume, strive to
 assure that your per-shard index fits entirely into the system memory
 available for OS caching of file system pages.

 In any case, a proof of concept implementation will tell you everything
 you need to know.


 -- Jack Krupansky

 -Original Message- From: Vineet Mishra
 Sent: Wednesday, June 4, 2014 2:45 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr maximum Optimal Index Size per Shard


 Thanks all for your response.
 I presume this conversation concludes that indexing around 1Billion
 documents per shard won't be a problem, as I have 10 Billion docs to index,
 so approx 10 shards with 1 Billion each should be fine with it and how
 about Memory, what size of RAM should be fine for this amount of data?
 Moreover what should be the indexing technique for this huge data set, as
 currently I am indexing with EmbeddedSolrServer but its going pathetically
 slow after some 20Gb of indexing. Comparatively SolrHttpPost was slow due
 to network delays and response but after this long running the indexing
 with EmbeddedSolrServer I am getting a different notion.
 Any good indexing technique for this huge dataset would be highly
 appreciated.

 Thanks again!


 On Wed, Jun 4, 2014 at 6:40 AM, rulinma ruli...@gmail.com wrote:

  mark.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-maximum-
 Optimal-Index-Size-per-Shard-tp4139565p4139698.html
 Sent from the Solr - User mailing list archive at Nabble.com.





Re: Solr maximum Optimal Index Size per Shard

2014-06-06 Thread Vineet Mishra
Hi Toke,

That was spectacular. It is really great to hear that you have already
indexed 2.7TB+ of data on your server and the query response time is still
somewhere between milliseconds and a few seconds for such a huge dataset.
Could you say which indexing mechanism you are using? I started with
EmbeddedSolrServer but it became pretty slow after a few GB (~30+) of
indexing. I started indexing a week ago and the index is still only 37GB,
although I assume the HttpPost mechanism would be even slower due to network
latency and waiting for responses. Furthermore, I tried CloudSolrServer but
am facing a weird exception, a ClassCastException saying "Cannot cast to
Exception", while adding the SolrInputDocument to the server.

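import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

// The ZooKeeper hosts are passed as a single comma-separated string.
CloudSolrServer server1 = new
CloudSolrServer("zkHost:port1,zkHost:port2,zkHost:port3", false);
server1.setDefaultCollection("mycollection");
SolrInputDocument doc = new SolrInputDocument();
doc.addField("ID", 123);
doc.addField("A0_s", 282628854);

server1.add(doc); //Error at this line
server1.commit();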

Thanks again, Toke, for sharing those stats.


On Fri, Jun 6, 2014 at 5:04 PM, Toke Eskildsen t...@statsbiblioteket.dk
wrote:

 On Fri, 2014-06-06 at 12:32 +0200, Vineet Mishra wrote:
  *Does that mean for querying smoothly we need to have memory atleast
 equal
  or greater to the size of index?

 If you absolutely, positively have to reduce latency as much as
 possible, then yes. With an estimated index size of 2TB, I would guess
 that 10-20 machines with powerful CPUs (1 per shard per expected
 concurrent request) would also be advisable. While you're at it, do make
 sure that you're using high-speed memory.

 That was not a serious suggestion, should you be in doubt. Very few
 people need the best latency possible. Most just need the individual
 searches to be fast enough and want to scale throughput instead.

  As in my case the index size will be very heavy(~2TB) and practically
  speaking that amount of memory is not possible. Even If it goes to
  multiple shards, say around 10 Shards then also 200GB of RAM will not
  be an feasible option.

 We're building a projected 24TB index collection and are currently at
 2.7TB+, growing with about 1TB/10 days. Our current plan is to use a
 single machine with 256GB of RAM, but we will of course adjust along the
 way if it proves to be too small.

 Requirements differ with the corpus and the needs, but for us, SSDs as
 storage seems to provide quite enough of a punch. I did a little testing
 yesterday: https://plus.google.com/u/0/+TokeEskildsen/posts/4yPvzrQo8A7

 tl;dr: for small result sets (< 1M hits) on unwarmed searches with
 simple queries, response time is below 100ms. If we enable faceting with
 plain Solr, this jumps to about 1 second.

 I did a top on the machine and it says that 50GB is currently used for
 caching, so an 80GB (and probably less) machine would work fine for our
 2.7TB index.


 - Toke Eskildsen, State and University Library, Denmark





Re: Solr maximum Optimal Index Size per Shard

2014-06-06 Thread Vineet Mishra
Earlier I used to index with the HttpPost mechanism only, keeping each post
between 2MB and 20MB in size, and that was going fine. But we suspected that
instead of indexing through network calls (which of course add latency due
to network delays and the HTTP protocol), it would be much better to index
offline by just writing the index and dumping it to the shards.

I am currently committing after each batch of 25K docs, which I will try to
replace with commitWithin (it seems to work faster), or I will probably
have a look at this binary protocol.
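
The batching I have in mind looks roughly like the Java sketch below. It is
written against a generic sender callback so that it runs without Solr.
BatchingBuffer and its sender are hypothetical names. In real code the
sender would hand each batch to the SolrJ client, for example with an add
call that takes a commitWithin value, instead of issuing an explicit commit
per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchingBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sender;
    private final List<T> pending = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<List<T>> sender) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sender = sender;
    }

    /** Buffers one document and sends a batch once the buffer is full. */
    void add(T doc) {
        pending.add(doc);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered; call once more at the end of the run. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(pending)); // copy, so the sender owns its list
        pending.clear();
    }

    public static void main(String[] args) {
        List<Integer> sentSizes = new ArrayList<>();
        // Stand-in sender that only records batch sizes.
        BatchingBuffer<String> buffer =
                new BatchingBuffer<>(25000, batch -> sentSizes.add(batch.size()));
        for (int i = 0; i < 60000; i++) {
            buffer.add("doc-" + i);
        }
        buffer.flush();
        System.out.println(sentSizes); // prints [25000, 25000, 10000]
    }
}
```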

Thanks!




On Fri, Jun 6, 2014 at 5:55 PM, Toke Eskildsen t...@statsbiblioteket.dk
wrote:

 On Fri, 2014-06-06 at 14:05 +0200, Vineet Mishra wrote:

  Could you state what indexing mechanism are you using, as I started
  with EmbeddedSolrServer but it was pretty slow after a few GB(~30+) of
  indexing.

 I suspect that is due to too-frequent commits, too small heap or
 something third, unrelated to EmbeddedSolrServer itself. Underneath the
 surface it is just the same as a standalone Solr.

 We're building our ~1TB indexes individually, using standalone workers
 for the heavy part of the analysis (Tika). The delivery from the workers
 to the Solr server is over the network, using the Solr binary protocol.
 My colleague Thomas Egense just created a small write-up at
 https://github.com/netarchivesuite/netsearch

   I started indexing 1 week back and still its 37GB, although I assume
  HttpPost mechanism will perform lethargic slow due to network latency
  and for the response await.

 Maybe if you send the documents one at a time, but if you bundle them in
 larger updates, the post-method should be fine.

 - Toke Eskildsen, State and University Library, Denmark





Re: Solr maximum Optimal Index Size per Shard

2014-06-04 Thread Vineet Mishra
Thanks all for your responses.
I presume this conversation concludes that indexing around 1 billion
documents per shard won't be a problem. As I have 10 billion docs to index,
approximately 10 shards with 1 billion each should be fine. And what about
memory: what size of RAM should be fine for this amount of data?
Moreover, what should the indexing technique be for this huge data set?
Currently I am indexing with EmbeddedSolrServer, but it gets pathetically
slow after some 20GB of indexing. By comparison, SolrHttpPost was slow due
to network delays and waiting for responses, but after running the indexing
with EmbeddedSolrServer for this long I am getting a different impression.
Any good indexing technique for this huge dataset would be highly
appreciated.

Thanks again!


On Wed, Jun 4, 2014 at 6:40 AM, rulinma ruli...@gmail.com wrote:

 mark.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-maximum-Optimal-Index-Size-per-Shard-tp4139565p4139698.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Solr maximum Optimal Index Size per Shard

2014-06-03 Thread Vineet Mishra
Hi All,

Has anyone come across a maximum threshold, by document count or by size,
for what each Solr core can hold?
I have indexed some 10 million documents (18GB), and when I index another
5 million documents (9GB) on top of these, the Stats query responds a little
slowly.

Considering I have around 2TB of data to index, what would be an
appropriately balanced ratio of data to number of shards?
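
As a back-of-the-envelope illustration of what I am asking, here is a small
Java sketch. It assumes the index grows linearly with the document count and
that the 2TB figure refers to index size. The 100GB-per-shard target is an
arbitrary placeholder and exactly the number I am looking for guidance on.

```java
final class ShardCountEstimate {
    /** Extrapolates a document count from a measured sample, assuming linear growth. */
    static long estimatedDocs(double totalGb, double sampleGb, long sampleDocs) {
        return Math.round(totalGb / sampleGb * sampleDocs);
    }

    /** Number of shards needed so that no shard exceeds the target size. */
    static int shardsNeeded(double totalGb, double targetGbPerShard) {
        return (int) Math.ceil(totalGb / targetGbPerShard);
    }

    public static void main(String[] args) {
        double totalGb = 2048.0;          // ~2TB to index
        double sampleGb = 18.0;           // measured: 10 million documents took 18GB
        long sampleDocs = 10_000_000L;
        double targetGbPerShard = 100.0;  // placeholder target, not a recommendation

        System.out.println("estimated documents: "
                + estimatedDocs(totalGb, sampleGb, sampleDocs));
        System.out.println("shards at " + targetGbPerShard + " GB each: "
                + shardsNeeded(totalGb, targetGbPerShard));
    }
}
```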

It is more a matter of indexing big data for NRT.
Looking forward to your response.
Urgent!

Thanks!


Re: Offline Indexes Update to Shard

2014-06-02 Thread Vineet Mishra
Hi Otis,

I have to index a huge amount of data, around billions of records. Since
indexing via the HTTP post mechanism would be slow due to network delay, I
am indexing through EmbeddedSolrServer to create an index that I can later
upload to different shards in SolrCloud. Copying the index over by hand is
possible, but I was looking for some alternative that takes care of copying
it to a shard and its replicas.
Is manual copying a good approach, and the only one? I ask because the index
may grow up to a TB or so.

Looking forward to your response.

Thanks!


On Thu, May 29, 2014 at 7:52 PM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 Hi,

 On Wed, May 28, 2014 at 4:25 AM, Vineet Mishra clearmido...@gmail.com
 wrote:

  Hi All,
 
  Has anyone tried with building Offline indexes with EmbeddedSolrServer
 and
  posting it to Shards.
 

 What do you mean by posting it to shards?  How is that different than
 copying them manually to the right location in FS?  Could you please
 elaborate?

 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/



  FYI, I am done building the indexes but looking out for a way to post
 these
  index files on shards.
  Copying the indexes manually to each shard's replica is possible and is
  working fine but I don't want to go with that approach.
 
  Thanks!
 



Re: Offline Indexes Update to Shard

2014-06-02 Thread Vineet Mishra
Hi Erick,

Thanks for your mail. Please let me go through my use case.
I have around 20-40 billion records to index, with each record having
around 200-400 fields. The data is sensor data, so it can easily be stored
as Integer or Float. To index this huge amount of data I am indexing through
EmbeddedSolrServer, which was working fine, but I was looking for a way to
move the generated indexes to different shards, ideally without copying
them by hand to each machine: some approach where I submit the indexes to a
shard and let the shard take care of distributing them to the leader and
replicas.
I want to mention one more thing: when I started indexing with
EmbeddedSolrServer it went fine for the first few million documents, but
after that the indexing speed became pathetically slow. It indexed around
20GB in a day and has indexed just 9GB in the following 2 days.
Any approach for optimizing indexing would also be welcome.

Hope this makes things much clearer.
Looking forward to hearing from you soon.

Thanks and Regards!


On Fri, May 30, 2014 at 9:09 PM, Erick Erickson erickerick...@gmail.com
wrote:

 You can copy to the shards and use the mergeindexes command, the
 MapReduceIndexerTool follows that approach.

 But really, what is the higher-level use-case you're trying to support?
 This feels a little like an XY problem. You could do things like
 1 index to a different collection then use collection aliasing to switch
 2 just re-index to the current collection.
 3 use the MapReduceIndexerTool (admittedly it needs Hadoop).

 All in all, it feels like you're doing work you don't need to do. But
 that's a guess since you haven't told us what the use-case is.

 Best,
 Erick


 On Thu, May 29, 2014 at 7:22 AM, Otis Gospodnetic 
 otis.gospodne...@gmail.com wrote:

  Hi,
 
  On Wed, May 28, 2014 at 4:25 AM, Vineet Mishra clearmido...@gmail.com
  wrote:
 
   Hi All,
  
   Has anyone tried with building Offline indexes with EmbeddedSolrServer
  and
   posting it to Shards.
  
 
  What do you mean by posting it to shards?  How is that different than
  copying them manually to the right location in FS?  Could you please
  elaborate?
 
  Otis
  --
  Performance Monitoring * Log Analytics * Search Analytics
  Solr & Elasticsearch Support * http://sematext.com/
 
 
 
   FYI, I am done building the indexes but looking out for a way to post
  these
   index files on shards.
   Copying the indexes manually to each shard's replica is possible and is
   working fine but I don't want to go with that approach.
  
   Thanks!
  
 



Re: Offline Indexes Update to Shard

2014-06-02 Thread Vineet Mishra
Hi Wolfgang,

Thanks for your response. Can you point to a running example of using
MapReduceIndexerTool to index CSV files?
If you are referring to
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/Cloudera-Search-User-Guide/csug_mapreduceindexertool.html?scroll=csug_topic_6_1

I have a few points to clarify:
*What is a morphline?
*Is it necessary to use a morphline for indexing? If yes, how do I create one?
*Can the index reside only on HDFS and not on the local FS?
*What is the minimum CDH version supported for it?

Looking forward to your response.

Thanks!


On Mon, Jun 2, 2014 at 2:24 PM, Wolfgang Hoschek whosc...@cloudera.com
wrote:

 Sounds like you should consider using MapReduceIndexerTool. AFAIK, this is
 the most scalable indexing (and merging) solution out there.

 Wolfgang.

 On Jun 2, 2014, at 10:33 AM, Vineet Mishra clearmido...@gmail.com wrote:

  Hi Erick,
 
  Thanks for your mail, please let me go through with my use case.
  I am having around 20-40 Billion Records to index with each record is
  having around 200-400 fields, the data is sensor data so it can be easily
  stored in Integer or Float. Now to index this huge amount of data I am
  going with the indexing through EmbeddedSolrServer which was working fine
  but I was looking out for a way to move these generated indexes to
  different shards possibly without copying pasting it to each machines but
  some other approach as to submit this indexes to some shard and let the
  shard take care of it distributing it over leader and replica.
  I want to mention one more thing, as I started indexing with
 EmbeddedSolrServer
  it went fine for some million of starting documents but there after the
  indexing speed is pathetically slow, it indexed around 20GB in a day and
  just have indexed 9 GB in another 2 days.
  Any indexing optimization approach also requested.
 
  Hope this makes things much clearer.
  Looking forward to soon hear from you.
 
  Thanks and Regards!
 
 
  On Fri, May 30, 2014 at 9:09 PM, Erick Erickson erickerick...@gmail.com
 
  wrote:
 
  You can copy to the shards and use the mergeindexes command, the
  MapReduceIndexerTool follows that approach.
 
  But really, what is the higher-level use-case you're trying to support?
  This feels a little like an XY problem. You could do things like
  1 index to a different collection then use collection aliasing to
 switch
  2 just re-index to the current collection.
  3 use the MapReduceIndexerTool (admittedly it needs Hadoop).
 
  All in all, it feels like you're doing work you don't need to do. But
  that's a guess since you haven't told us what the use-case is.
 
  Best,
  Erick
 
 
  On Thu, May 29, 2014 at 7:22 AM, Otis Gospodnetic 
  otis.gospodne...@gmail.com wrote:
 
  Hi,
 
  On Wed, May 28, 2014 at 4:25 AM, Vineet Mishra clearmido...@gmail.com
  wrote:
 
  Hi All,
 
  Has anyone tried with building Offline indexes with EmbeddedSolrServer
  and
  posting it to Shards.
 
 
  What do you mean by posting it to shards?  How is that different than
  copying them manually to the right location in FS?  Could you please
  elaborate?
 
  Otis
  --
  Performance Monitoring * Log Analytics * Search Analytics
  Solr & Elasticsearch Support * http://sematext.com/
 
 
 
  FYI, I am done building the indexes but looking out for a way to post
  these
  index files on shards.
  Copying the indexes manually to each shard's replica is possible and
 is
  working fine but I don't want to go with that approach.
 
  Thanks!
 
 
 




Offline Indexes Update to Shard

2014-05-28 Thread Vineet Mishra
Hi All,

Has anyone tried building offline indexes with EmbeddedSolrServer and
posting them to shards?
FYI, I am done building the indexes but am looking for a way to post these
index files to the shards.
Copying the indexes manually to each shard's replica is possible and is
working fine, but I don't want to go with that approach.

Thanks!


Indexing Getting Failed

2014-05-16 Thread Vineet Mishra
Hi

I have set up a default SolrCloud 4.6.0 cluster with the built-in ZooKeeper,
running on Jetty. When I started indexing, the first few thousand documents
went fine, but after some 5000 documents or so it started giving the error
below and indexing stopped, as the ZooKeeper leader election was in
transition. Is this problem due to the built-in ZooKeeper?

*Error Trace:*

ERROR org.apache.solr.core.SolrCore  – org.apache.solr.common.SolrException: No registered leader was found, collection:collection1 slice:shard2
    at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:484)
    at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:467)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:223)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
    at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:89)
    at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:151)
    at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:131)
    at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:223)
    at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:116)
    at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:188)
    at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:114)
    at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:158)
    at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:99)
    at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:679)

Any suggestions would be appreciated.

Thanks!


Re: Inconsistent response from Cloud Query

2014-05-15 Thread Vineet Mishra
Hi Shawn,

No replica is recovering in my case, and no commit is pending.
The case I am describing occurs when I restart the whole cloud with the
index already flushed to disk.

Thanks!


On Sun, May 11, 2014 at 10:17 PM, Shawn Heisey s...@elyograg.org wrote:

 On 5/9/2014 11:42 AM, Cool Techi wrote:
  We have noticed Solr returns in-consistent results during replica
 recovery and not all replicas are in the same state, so when your query
 goes to a replica which might be recovering or still copying the index then
 the counts may differ.
  regards,Ayush

 SolrCloud should never send requests to a replica that is recovering.
 If that is happening (which I think is unlikely), then it's a bug.

 If *you* send a request to a replica that is still recovering, I would
 expect SolrCloud to redirect the request elsewhere unless distrib=false
 is used.  I'm not sure whether that actually happens, though.

 Thanks,
 Shawn




Fwd: Inconsistent response from Cloud Query

2014-05-13 Thread Vineet Mishra
Copying the community.
Looking forward to your response.

-- Forwarded message --
From: Vineet Mishra clearmido...@gmail.com
Date: Mon, May 12, 2014 at 5:57 PM
Subject: Re: Inconsistent response from Cloud Query
To: solr-user@lucene.apache.org


Hi Shawn,

No replica is recovering in my case, and no commit is pending.
The case I am describing occurs when I restart the whole cloud with the
index already flushed to disk.

Thanks!


On Sun, May 11, 2014 at 10:17 PM, Shawn Heisey s...@elyograg.org wrote:

 On 5/9/2014 11:42 AM, Cool Techi wrote:
  We have noticed Solr returns in-consistent results during replica
 recovery and not all replicas are in the same state, so when your query
 goes to a replica which might be recovering or still copying the index then
 the counts may differ.
  regards,Ayush

 SolrCloud should never send requests to a replica that is recovering.
 If that is happening (which I think is unlikely), then it's a bug.

 If *you* send a request to a replica that is still recovering, I would
 expect SolrCloud to redirect the request elsewhere unless distrib=false
 is used.  I'm not sure whether that actually happens, though.

 Thanks,
 Shawn




Inconsistent response from Cloud Query

2014-05-06 Thread Vineet Mishra
Hi All,

I have set up SolrCloud 4.6.2 with the default configuration on a single
machine, with 2 shards and a replication factor of 2, following
https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud

The cloud came up and I indexed the example XML data into it without any
problem.
Now when I query with *distrib=true* the result is inconsistent: sometimes
the response contains 4 results and sometimes 8 (the actual number).
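One way to narrow this kind of discrepancy down is to ask each replica for its own document count (for example with distrib=false, as mentioned elsewhere in this thread) and compare the numbers per shard. The sketch below only does the comparison step; the class name, the replica labels and the counts are all hypothetical and are not taken from the setup described above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not a Solr class): compares per-replica document counts. */
class ReplicaCountCheck {

    /** Returns the shards whose replicas do not all report the same count. */
    static List<String> shardsWithDisagreeingReplicas(Map<String, Map<String, Long>> countsByShard) {
        List<String> suspicious = new ArrayList<String>();
        for (Map.Entry<String, Map<String, Long>> shard : countsByShard.entrySet()) {
            // If every replica agrees, the set of distinct counts has exactly one element.
            Set<Long> distinct = new HashSet<Long>(shard.getValue().values());
            if (distinct.size() > 1) {
                suspicious.add(shard.getKey());
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        // Made-up counts: one replica of shard2 is empty, so a distributed
        // query would return 8 or 4 documents depending on which replica answers.
        Map<String, Long> shard1 = new LinkedHashMap<String, Long>();
        shard1.put("replicaA", 4L);
        shard1.put("replicaB", 4L);
        Map<String, Long> shard2 = new LinkedHashMap<String, Long>();
        shard2.put("replicaA", 4L);
        shard2.put("replicaB", 0L);

        Map<String, Map<String, Long>> counts = new LinkedHashMap<String, Map<String, Long>>();
        counts.put("shard1", shard1);
        counts.put("shard2", shard2);
        System.out.println(shardsWithDisagreeingReplicas(counts));
    }
}
```

A shard that shows up in the returned list is the place to look first; shards whose replicas agree are unlikely to explain a total that changes between requests.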

Has anyone run into this situation? Looking forward to a quick
response.

Thanks!


Re: Indexing Big Data With or Without Solr

2014-04-23 Thread Vineet Mishra
I did it with Tomcat and a ZooKeeper ensemble; I will mail you the steps
shortly.

Cheers


On Sat, Apr 19, 2014 at 9:09 AM, Aman Tandon amantandon...@gmail.com wrote:

 Vineet please share after you setup for solr cloud
 Are you using jetty or tomcat.?

 On Saturday, April 19, 2014, Vineet Mishra clearmido...@gmail.com wrote:
  Thanks Furkan, I will definitely give it a try then.
 
  Thanks again!
 
 
 
 
  On Tue, Apr 15, 2014 at 7:53 PM, Furkan KAMACI furkankam...@gmail.com
 wrote:
 
  Hi Vineet;
 
  I've been using SolrCloud for such kind of Big Data and I think that you
  should consider to use it. If you have any problems you can ask it here.
 
  Thanks;
  Furkan KAMACI
 
 
  2014-04-15 13:20 GMT+03:00 Vineet Mishra clearmido...@gmail.com:
 
   Hi All,
  
   I have worked with Solr 3.5 to implement real time search on some
 100GB
   data, that worked fine but was little slow on complex queries(Multiple
   group/joined queries).
   But now I want to index some real Big Data(around 4 TB or even more),
 can
   SolrCloud be solution for it if not what could be the best possible
   solution in this case.
  
   *Stats for the previous Implementation:*
   It was Master Slave Architecture with normal Standalone multiple
 instance
   of Solr 3.5. There were around 12 Solr instance running on different
   machines.
  
   *Things to consider for the next implementation:*
   Since all the data is sensor data hence it is the factor of duplicity
 and
   uniqueness.
  
   *Really urgent, please take the call on priority with set of feasible
   solution.*
  
   Regards
  
 
 

 --
 Sent from Gmail Mobile



Re: Indexing Big Data With or Without Solr

2014-04-18 Thread Vineet Mishra
Thanks Furkan, I will definitely give it a try then.

Thanks again!




On Tue, Apr 15, 2014 at 7:53 PM, Furkan KAMACI furkankam...@gmail.com wrote:

 Hi Vineet;

 I've been using SolrCloud for such kind of Big Data and I think that you
 should consider to use it. If you have any problems you can ask it here.

 Thanks;
 Furkan KAMACI


 2014-04-15 13:20 GMT+03:00 Vineet Mishra clearmido...@gmail.com:

  Hi All,
 
  I have worked with Solr 3.5 to implement real time search on some 100GB
  data, that worked fine but was little slow on complex queries(Multiple
  group/joined queries).
  But now I want to index some real Big Data(around 4 TB or even more), can
  SolrCloud be solution for it if not what could be the best possible
  solution in this case.
 
  *Stats for the previous Implementation:*
  It was Master Slave Architecture with normal Standalone multiple instance
  of Solr 3.5. There were around 12 Solr instance running on different
  machines.
 
  *Things to consider for the next implementation:*
  Since all the data is sensor data hence it is the factor of duplicity and
  uniqueness.
 
  *Really urgent, please take the call on priority with set of feasible
  solution.*
 
  Regards
 



Indexing Big Data With or Without Solr

2014-04-15 Thread Vineet Mishra
Hi All,

I have worked with Solr 3.5 to implement real-time search on about 100 GB of
data. That worked fine but was a little slow on complex queries (multiple
group/join queries).
Now I want to index some really big data (around 4 TB or even more). Can
SolrCloud be a solution for this? If not, what would be the best possible
solution in this case?

*Stats for the previous implementation:*
It was a master-slave architecture with multiple standalone instances of
Solr 3.5. There were around 12 Solr instances running on different
machines.

*Things to consider for the next implementation:*
Since all the data is sensor data, duplication and uniqueness are factors
to consider.

*This is really urgent; please treat it as a priority and suggest a set of
feasible solutions.*

Regards


Re: SolrCloud with Tomcat

2014-03-10 Thread Vineet Mishra
Hi

Got it working!

Many thanks for your response.


On Sat, Mar 8, 2014 at 7:40 PM, Furkan KAMACI furkankam...@gmail.com wrote:

 Hi;

 Could you check here:

 http://lucene.472066.n3.nabble.com/Error-when-creating-collection-in-Solr-4-6-td4103536.html

 Thanks;
 Furkan KAMACI


 2014-03-07 9:44 GMT+02:00 Vineet Mishra clearmido...@gmail.com:

  Hi
 
  I am installing SolrCloud with 3 External
  Zookeeper(localhost:2181,localhost:2182,localhost:2183) and 2
  Tomcats(localhost:8181,localhost:8182) all available on a single
  Machine(Just for getting started).
  By Following these links
 
  http://myjeeva.com/solrcloud-cluster-single-collection-deployment.html
  http://wiki.apache.org/solr/SolrCloudTomcat
 
  I have got the Solr UI on the machine pointing to
 
  http://localhost:8181/solr/#/~cloud
 
  In the Cloud Graph View it is coming with
 
  mycollection
  |
  |_ shard1
  |_ shard2
 
  But both the shards are empty and showing no cores or replica.
 
  Following the blog post
  http://myjeeva.com/solrcloud-cluster-single-collection-deployment.html ,
  I have been successful till starting tomcat,
  since after the section Creating Collection, Shard(s), Replica(s) in
  SolrCloud I am facing the problem.
 
  Giving command to create replica for the shard using
 
  curl 'http://localhost:8181/solr/admin/cores?action=CREATE&name=shard1-replica-2&collection=mycollection&shard=shard1'
 
  it is giving error
 
  <response>
  <lst name="responseHeader"><int name="status">400</int><int name="QTime">137</int></lst>
  <lst name="error"><str name="msg">Error CREATEing SolrCore 'shard1-replica-2': 192.168.2.183:8182_solr_shard1-replica-2 is removed</str>
  <int name="code">400</int></lst>
  </response>
 
  Has anybody went through this issue?
 
  Regards
 



SolrCloud with Tomcat

2014-03-06 Thread Vineet Mishra
Hi

I am installing SolrCloud with 3 external ZooKeeper instances
(localhost:2181, localhost:2182, localhost:2183) and 2 Tomcats
(localhost:8181, localhost:8182), all on a single machine (just to get
started), by following these links:

http://myjeeva.com/solrcloud-cluster-single-collection-deployment.html
http://wiki.apache.org/solr/SolrCloudTomcat

The Solr UI is available on the machine at

http://localhost:8181/solr/#/~cloud

The Cloud graph view shows

mycollection
|
|_ shard1
|_ shard2

But both shards are empty and show no cores or replicas.

Following the blog post
http://myjeeva.com/solrcloud-cluster-single-collection-deployment.html
I was successful up to starting Tomcat. The problem starts after the section
"Creating Collection, Shard(s), Replica(s) in SolrCloud".

I ran this command to create a replica for the shard:

curl 'http://localhost:8181/solr/admin/cores?action=CREATE&name=shard1-replica-2&collection=mycollection&shard=shard1'

it is giving error

<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">137</int></lst>
<lst name="error"><str name="msg">Error CREATEing SolrCore 'shard1-replica-2': 192.168.2.183:8182_solr_shard1-replica-2 is removed</str>
<int name="code">400</int></lst>
</response>
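The CoreAdmin request above only works if its parameters are separated by `&` and the whole URL is quoted in the shell. As a minimal sketch of building such a URL programmatically, assuming the same host, core, collection and shard names as in this message (the `CoreAdminUrl` class itself is hypothetical and not part of Solr):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Solr): builds a CoreAdmin CREATE URL. */
class CoreAdminUrl {

    /** Joins the parameters with '&' and URL-encodes each name and value. */
    static String create(String solrBase, String coreName, String collection, String shard) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("action", "CREATE");
        params.put("name", coreName);
        params.put("collection", collection);
        params.put("shard", shard);

        StringBuilder url = new StringBuilder(solrBase).append("/admin/cores?");
        boolean first = true;
        for (Map.Entry<String, String> p : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(encode(p.getKey())).append('=').append(encode(p.getValue()));
            first = false;
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create("http://localhost:8181/solr",
                "shard1-replica-2", "mycollection", "shard1"));
    }
}
```

This only constructs the request string; it does not explain the "is removed" error itself, which comes from the server side.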

Has anybody run into this issue?

Regards


Re: Fault Tolerant Technique of Solr Cloud

2014-02-27 Thread Vineet Mishra
Hi Per

Thanks for your response, I got it working.
What I was really interested in, though, was querying the same cloud from the
UI when one of the servers is down, by querying that same server to get the
collection result. But I guess that's not possible.
Thanks!




On Mon, Feb 24, 2014 at 7:36 PM, Per Steffensen st...@designware.dk wrote:

 On 24/02/14 13:04, Vineet Mishra wrote:

 Can you briefly explain how to make a direct call to ZooKeeper instead of the
 cloud collection? Currently I query the cloud from the UI with something like
 http://192.168.2.183:8900/solr/collection1/select?q=*:* but if I assume the
 shard on port 8900 is down, how can I still make the call?

 It is obvious that you cannot make the call to localhost:8900 - the server
 listening to that port is down. You can make the call to any of the other
 servers, though. Information about which Solr-servers are running is
 available in ZooKeeper, CloudSolrServer reads that information in order to
 know which servers to route requests to. As long as localhost:8900 is down
 it will not route requests to that server.

 You do not make a direct call to ZooKeeper. ZooKeeper is not an HTTP
 server that will receive your calls. It just has information about which
 Solr-servers are up and running. CloudSolrServers takes advantage of that
 information. You really cannot do without CloudSolrServer (or at least
 LBHttpSolrServer), unless you write a component that can do the same thing
 in some other language (if the reason you do not want to use
 CloudSolrServer, is that your client is not java). Else you need to do
 other clever stuff, like e.g. what Shalin suggests.

 Regards, Per Steffensen



Scalability Limit of SolrCloud

2014-02-27 Thread Vineet Mishra
Hi All

What is the scalability limit of SolrCloud? Can it index billions of
documents, each containing 400-500 numeric fields (probably float or
double)?
Is it possible and feasible to go with the current SolrCloud architecture, or
are there other alternatives or replacements?

Regards


Re: Fault Tolerant Technique of Solr Cloud

2014-02-24 Thread Vineet Mishra
Can you briefly explain how to make a direct call to ZooKeeper instead of the
cloud collection? Currently I query the cloud from the UI with something like
http://192.168.2.183:8900/solr/collection1/select?q=*:* but if I assume the
shard on port 8900 is down, how can I still make the call?

I have followed the Apache tutorial (with a separate ZooKeeper running on port
2181):

http://wiki.apache.org/solr/SolrCloud

Can you please be more specific about ZooKeeper distributed calls?

Regards


On Wed, Feb 19, 2014 at 9:45 PM, Per Steffensen st...@designware.dk wrote:

 On 19/02/14 07:57, Vineet Mishra wrote:

 Thanks for all your response but my doubt is which *Server:Port* should
 the

 query be made as we don't know the crashed server or which server might
 crash in the future(as any server can go down).

 That is what CloudSolrServer will deal with for you. It knows which
 servers are down and make sure not to send request to those servers.


 The only intention for writing this doubt is to get an idea about how the
 query format for distributed search might work if any of the shard or
 replica goes down.


 // Setting up your CloudSolrServer client
 // zkConnectionStr being the same string as you provide in -DzkHost when
 // starting your servers
 CloudSolrServer client = new CloudSolrServer(zkConnectionStr);
 client.setDefaultCollection("collection1");
 client.connect();

 // Creating and firing queries (you can do it in different way, but at
 // least this is an option)
 SolrQuery query = new SolrQuery("*:*");
 QueryResponse results = client.query(query);

 Because you are using CloudSolrServer you do not have to worry about not
 sending the request to a crashed server.

 In your example I believe the situation is as follows:
 * One collection called collection1 with two shards, shard1 and
 shard2, each having two replicas, replica1 and replica2 (a replica is an
 instance of a shard, and when you have one replica you are not having
 replication).
 * collection1.shard1.replica1 is running on localhost:8983 and
 collection1.shard1.replica2 is running on localhost:8900 (or maybe switched)
 * collection1.shard2.replica1 is running on localhost:7574 and
 collection1.shard2.replica2 is running on localhost:7500 (or maybe switched)
 If localhost:8900 is the only server that is down, all data is still
 available for search because every shard has at least one replica running.
 In that case I believe setting shards.tolerant will not make a
 difference. You will get your response no matter what. But if
 localhost:8983 was also down there would be no live replica of shard1. In that
 case you will get an exception from your query, indicating that the query
 cannot be carried out over the complete data-set. In that case if you set
 shards.tolerant that behaviour will change, and you will not get an
 exception - you will get a real response, but it will just not include data
 from shard1, because it is not available at the moment. That is just the
 way I believe shards.tolerant works, but you might want to verify that.
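The reasoning above can be sketched in a few lines: a shard is searchable as long as at least one of its replicas is on a live node. The code below is only a model of that rule using the host:port layout from this thread; `ShardCoverage` and its methods are hypothetical names, and nothing here talks to Solr or ZooKeeper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of shard availability; not a Solr or SolrJ class. */
class ShardCoverage {

    /** The two-shard, two-replica layout discussed in this thread. */
    static Map<String, List<String>> exampleLayout() {
        Map<String, List<String>> layout = new LinkedHashMap<String, List<String>>();
        layout.put("shard1", Arrays.asList("localhost:8983", "localhost:8900"));
        layout.put("shard2", Arrays.asList("localhost:7574", "localhost:7500"));
        return layout;
    }

    /** Returns the shards for which every replica sits on a node that is down. */
    static List<String> shardsWithoutLiveReplica(Map<String, List<String>> replicasByShard,
                                                 Set<String> downNodes) {
        List<String> unavailable = new ArrayList<String>();
        for (Map.Entry<String, List<String>> shard : replicasByShard.entrySet()) {
            boolean anyLive = false;
            for (String node : shard.getValue()) {
                if (!downNodes.contains(node)) {
                    anyLive = true;
                    break;
                }
            }
            if (!anyLive) {
                unavailable.add(shard.getKey());
            }
        }
        return unavailable;
    }

    public static void main(String[] args) {
        Set<String> down = new HashSet<String>(Arrays.asList("localhost:8900"));
        System.out.println(shardsWithoutLiveReplica(exampleLayout(), down)); // prints []
        down.add("localhost:8983");
        System.out.println(shardsWithoutLiveReplica(exampleLayout(), down)); // prints [shard1]
    }
}
```

An empty result corresponds to the case where the full data set is still searchable; a non-empty result is the case where, as described above, shards.tolerant would decide between an exception and a partial response.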

 To set shards.tolerant:

 SolrQuery query = new SolrQuery("*:*");
 query.set("shards.tolerant", true);
 QueryResponse results = client.query(query);


 I believe distributed search is the default, but you can explicitly require it by

 query.setDistrib(true);

 or

 query.set("distrib", true);


 Thanks





Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Vineet Mishra
Hi All,

I want to get a clear idea about the fault-tolerance capability of SolrCloud.

Consider that I have set up SolrCloud with an external ZooKeeper, 2 shards,
each having a replica, and a single collection, as given in the official Solr
documentation.

https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud

                Collection1
               /           \
        Shard 1             Shard 2
    localhost:8983      localhost:7574
    localhost:8900      localhost:7500


I indexed some documents, and then if I shut down any replica or leader,
for example localhost:8900, I can't query the collection on that
particular port:

http://localhost:8900/solr/collection1/select?q=*:*

So how is it fault tolerant, and how should the query be made?

Regards


Re: Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Vineet Mishra
Thanks for all your responses, but my doubt is about which *Server:Port* the
query should be sent to, since we don't know which server has crashed or
which might crash in the future (any server can go down).

My only intention in asking is to get an idea of how the query for a
distributed search should be formed if any shard or replica goes down.

Thanks


On Tue, Feb 18, 2014 at 11:22 PM, Shawn Heisey s...@elyograg.org wrote:

 On 2/18/2014 8:32 AM, Shawn Heisey wrote:

 On 2/18/2014 6:05 AM, Vineet Mishra wrote:

     Shard 1             Shard 2
 localhost:8983      localhost:7574
 localhost:8900      localhost:7500


 I Indexed some document and then if I shutdown any of the replica or
 Leader
 say for ex- *localhost:8900*, I can't query to the collection to that
 particular port

 http://localhost:8900/solr/collection1/select?q=*:*

 Then how is it Fault Tolerant or how the query has to be made.

 What is the complete error you are getting?  If you don't see the error
 in the response, you'll need to find your Solr Logfile and look for the
 error (including a large java stacktrace) there.


 Good catch by Per.  I did not notice that you were trying to send the
 query to the server that you took down.  This isn't going to work -- if the
 software you're trying to reach is not running, it won't respond.  Think
 about what happens if you are sending requests to a server and it crashes
 completely.

 If you want to always send to the same host/port, you will need a load
 balancer listening on that port.  You'll also want something that maintains
 a shared IP address, so that if the machine dies, the IP address and the
 load balancer move to another machine.  Haproxy and Pacemaker work very
 well as a combination for this.  There are many other choices, both
 hardware and software.

 Per also mentioned the other option - you can write code that knows about
 multiple URLs and can switch between them.  This is something you get for
 free with CloudSolrServer when writing Java code with SolrJ.
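For clients that cannot use SolrJ, the "code that knows about multiple URLs" option can be sketched roughly as below. This is a simplified illustration, not how CloudSolrServer is implemented: `FailoverQuery` is a hypothetical class, and the `fetch` function stands in for whatever HTTP call the application actually makes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical client-side failover helper; not a SolrJ class. */
class FailoverQuery {

    /**
     * Tries each base URL in order and returns the first successful response.
     * The fetch function is expected to throw a RuntimeException when a
     * server cannot be reached.
     */
    static String firstSuccessful(List<String> baseUrls, Function<String, String> fetch) {
        RuntimeException last = null;
        for (String baseUrl : baseUrls) {
            try {
                return fetch.apply(baseUrl);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next server
            }
        }
        throw new IllegalStateException("no server answered", last);
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList(
                "http://localhost:8900/solr/collection1",
                "http://localhost:8983/solr/collection1");
        // Fake fetcher for the demo: pretend the node on port 8900 is down.
        String answer = firstSuccessful(servers, url -> {
            if (url.contains(":8900")) {
                throw new IllegalStateException("connection refused: " + url);
            }
            return "answered by " + url;
        });
        System.out.println(answer);
    }
}
```

Unlike a ZooKeeper-aware client, this approach needs the server list to be maintained by hand and pays a failed connection attempt for every dead server in front of a live one.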

 Thanks,
 Shawn




pool-1-thread-4 java.lang.NoSuchMethodError: org.apache.solr.util.SimplePostTool

2013-12-06 Thread Vineet Mishra
pool-1-thread-4 java.lang.NoSuchMethodError:
org.apache.solr.util.SimplePostTool


I am getting this error while posting data to Solr from a generated XML file,
even though the Solr post.jar is present on the library classpath. I also
tried including the source class of the post tool.

This is urgent.

Thanks!


Making a Web Request is failing with 403 Request Forbidden

2013-10-30 Thread Vineet Mishra
Hi All,

I am making a web request to a link-shortening website, bit.ly, but I am
receiving a 403 Request Forbidden.
However, if I use their web page to shorten the link, it works fine.
Can anybody tell me what might be the reason for this odd behavior?

Here is the code.

String url = "https://bitly.com/shorten/";
StringBuffer response;
try {
    URL obj = new URL(url);
    HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();

    // add request headers
    con.setRequestMethod("POST");
    con.setRequestProperty("User-Agent", "Mozilla/5.0 (X11; Linux x86_64) "
            + "AppleWebKit/537.22 (KHTML, like Gecko) Ubuntu Chromium/25.0.1364.160 "
            + "Chrome/25.0.1364.160 Safari/537.22");
    con.setRequestProperty("Accept-Language", "en-US,en;q=0.8");
    con.setRequestProperty("Accept",
            "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
    con.setRequestProperty("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.3");
    con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
    con.setRequestProperty("Host", "bitly.com");

    String urlParameters = "url=http://bit.ly/1f3aLrP&ie=utf-8&oe=utf-8&gws_rd=cr"
            + "&ei=sKlwUvPbN8j-rAf-5IDwAQ&basic_style=1&classic_mode="
            + "&rapid_shorten_mode=&_xsrf=a2b71eaf499c4690a77a21d3c87e6302";

    // Send post request
    con.setDoOutput(true);
    DataOutputStream wr = new DataOutputStream(con.getOutputStream());
    wr.writeBytes(urlParameters);
    wr.flush();
    wr.close();

    int responseCode = con.getResponseCode();
    System.out.println("Response Code : " + responseCode);

    BufferedReader in = new BufferedReader(
            new InputStreamReader(con.getInputStream()));
    String inputLine;
    response = new StringBuffer();

    while ((inputLine = in.readLine()) != null) {
        response.append(inputLine);
    }
    in.close();
    System.out.println(response.toString());
} catch (MalformedURLException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (ProtocolException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

Hoping for your response.

Thanks!


Post Call to Solr RequestHandler

2013-08-09 Thread Vineet Mishra
Hi

I am currently working with a RequestHandler in Solr, where the user-defined
query is processed by the class specified for the request handler in
solrconfig.xml.

My requirement is to make it a POST call rather than a GET query call.

Is it possible, or is there some way to query a Solr RequestHandler with the
POST method?

This is urgent. Please reply as soon as possible.

Thanks
Vineet


Re: Unexpected character '<' (code 60) expected '='

2013-08-01 Thread Vineet Mishra
I am using Solr 3.5, and the XML file being posted is just 1 MB.


On Wed, Jul 31, 2013 at 8:19 PM, Shawn Heisey s...@elyograg.org wrote:

 On 7/31/2013 7:16 AM, Vineet Mishra wrote:
  I checked the File. . .nothing is there. I mean the formatting is
 correct,
  its a valid XML file.

 What version of Solr, and how large is your XML file?

 If Solr is older than version 4.1, then the POST buffer limit is decided
 by your container config, which based on your stacktrace, is tomcat.  If
 you have 4.1 or later, then the POST buffer limit is decided by Solr,
 and defaults to 2048KiB.

 Could that be the problem?

 Thanks,
 Shawn




Re: Unexpected character '<' (code 60) expected '='

2013-08-01 Thread Vineet Mishra
XML validation passes without any error. But when running again, it processes
a few lakh records and then stops, saying

SimplePostTool: FATAL: IOException while reading response:
java.io.IOException: Incomplete output stream

I guess this is an issue with thread resource management and locking.


On Thu, Aug 1, 2013 at 1:50 PM, Paul Masurel paul.masu...@gmail.com wrote:

 You can check for your xml validity with xmllint very simply.

 xmllint file

 Does this return an error?



 On Thu, Aug 1, 2013 at 9:59 AM, deniz denizdurmu...@gmail.com wrote:

  Vineet Mishra wrote
   I am using Solr 3.5 with the posting XML file size of just 1Mb.
  
  
   On Wed, Jul 31, 2013 at 8:19 PM, Shawn Heisey lt;
 
   solr@
 
   gt; wrote:
  
   On 7/31/2013 7:16 AM, Vineet Mishra wrote:
I checked the File. . .nothing is there. I mean the formatting is
   correct,
its a valid XML file.
  
   What version of Solr, and how large is your XML file?
  
   If Solr is older than version 4.1, then the POST buffer limit is
 decided
   by your container config, which based on your stacktrace, is tomcat.
  If
   you have 4.1 or later, then the POST buffer limit is decided by Solr,
   and defaults to 2048KiB.
  
   Could that be the problem?
  
   Thanks,
   Shawn
  
  
 
 
  you might need to escape some chars like < to &lt; and so on
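As a rough illustration of the escaping suggested here, the sketch below replaces the five characters that XML treats specially before a field value is written into the document being posted. `XmlEscape` is a hypothetical helper, not something shipped with Solr, and it assumes the values are written as element text or attribute values.

```java
/** Hypothetical helper: escapes text before embedding it in an XML document. */
class XmlEscape {

    /** Replaces the five XML special characters with their predefined entities. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '&':
                    out.append("&amp;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&apos;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a < b & c")); // prints a &lt; b &amp; c
    }
}
```

An unescaped `<` inside a field value is one plausible way to end up with a parser error that mentions code 60, although the thread does not confirm that this was the cause here.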
 
 
 
  -
  Zeki ama calismiyor... Calissa yapar...
  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/Unexpected-character-code-60-expected-tp4081603p4081854.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 



 --
 __

  Masurel Paul
  e-mail: paul.masu...@gmail.com



SimplePostTool: FATAL: Solr returned an error #400 Bad Request

2013-07-31 Thread Vineet Mishra
Hi All

I am currently in the middle of a project that indexes data into multiple
Solr instances.

The configuration is as follows: on the same machine I have set up multiple
instances of Solr

http://localhost:8080/solr1
http://localhost:8080/solr2
http://localhost:8080/solr3
http://localhost:8080/solr4
http://localhost:8080/solr5
http://localhost:8080/solr6

I post the data to Solr through SimplePostTool by passing an XML file to the
spt.postFile(file) method and committing afterwards.

The whole process is multithreaded and works fine up to about 1 million
records, but after that it suddenly stops, saying

SimplePostTool: FATAL: Solr returned an error #400 Bad Request

In the Tomcat Catalina log I found

WARNING: Failed to register info bean: searcher
javax.management.InstanceAlreadyExistsException: solr/:type=searcher,id=org.apache.solr.search.SolrIndexSearcher
    at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:513)
    at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:141)
    at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:47)
    at org.apache.solr.search.SolrIndexSearcher.register(SolrIndexSearcher.java:220)
    at org.apache.solr.core.SolrCore.registerSearcher(SolrCore.java:1349)
    at org.apache.solr.core.SolrCore.access$000(SolrCore.java:84)
    at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1247)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

Jul 31, 2013 12:46:00 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher Searcher@5fa1891b main
Jul 31, 2013 12:46:00 PM org.apache.solr.search.SolrIndexSearcher close

Has anybody tracked down such an issue? This is really urgent and
important. Waiting for your response.

Thanks and Regards
Vineet


Re: SimplePostTool: FATAL: Solr returned an error #400 Bad Request

2013-07-31 Thread Vineet Mishra
I got it resolved; the actual error was further up in the trace than this one.
The XML being posted was not formed properly for the Solr
field *Date*, which takes the format

2006-07-15T22:18:48Z

This is the standard format for the Solr date datatype, which specifically
follows patterns such as:


   - 1995-12-31T23:59:59Z
   - 1995-12-31T23:59:59.9Z
   - 1995-12-31T23:59:59.99Z
   - 1995-12-31T23:59:59.999Z


As documented by Solr http://www.meticent.com/DAt( *www.meticent.com/DAt*)
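A minimal sketch of producing a value in this format before writing it into the XML, assuming Java 8 or later and that the timestamp is already in UTC. `SolrDates` is a hypothetical helper name; the formatting itself relies on the JDK's `DateTimeFormatter.ISO_INSTANT`.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: formats a UTC timestamp the way the Solr date field expects. */
class SolrDates {

    /** Interprets the given date-time as UTC and renders it with a trailing 'Z'. */
    static String toSolrDate(LocalDateTime utc) {
        Instant instant = utc.toInstant(ZoneOffset.UTC);
        // ISO_INSTANT always prints seconds and adds fractional digits only when present.
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }

    public static void main(String[] args) {
        System.out.println(toSolrDate(LocalDateTime.of(2006, 7, 15, 22, 18, 48)));
    }
}
```

Generating the value through a formatter rather than string concatenation avoids the kind of malformed date that caused the 400 response described above.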

By the way thanks!
Vineet


On Wed, Jul 31, 2013 at 4:47 PM, Erick Erickson erickerick...@gmail.com wrote:

 Probably not the root of your problem, but
 bq: and committing it there after.

 Does that mean you're calling  commit after every
 document? This is usually poor practice, I'd set
 the autocommit intervals on solrconfig.xml and NOT
 call commit explicitly.

 Does the same document fail every time? What does
 it look like?

 You really haven't provided much information
 to go on.

 Best
 Erick

 On Wed, Jul 31, 2013 at 3:55 AM, Vineet Mishra clearmido...@gmail.com
 wrote:
  Hi All
 
  Currently I am in a mid of a project which Index some data to Solrs
  multiple instance.
 
  I have the Configuration as, on the same machine I have made multiple
  instances of Solr
 
  http://localhost:8080/solr1
  http://localhost:8080/solr2
  http://localhost:8080/solr3
  http://localhost:8080/solr4
  http://localhost:8080/solr5
  http://localhost:8080/solr6
 
  Now when I am posting the Data to Solr through SimplePostTool by passing
 a
  xml file in spt.postFile(file) method and committing it there after.
 
  This all process is Multithreaded and works fine till 1 Million of data
  record but there after it suddenly stops saying,
 
  SimplePostTool: FATAL: Solr returned an error #400 Bad Request
 
  in the Tomcat Catalina I found
 
  WARNING: Failed to register info bean: searcher
  javax.management.InstanceAlreadyExistsException: solr/:type=searcher,id=org.apache.solr.search.SolrIndexSearcher
      at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
      at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
      at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
      at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
      at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
      at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:513)
      at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:141)
      at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:47)
      at org.apache.solr.search.SolrIndexSearcher.register(SolrIndexSearcher.java:220)
      at org.apache.solr.core.SolrCore.registerSearcher(SolrCore.java:1349)
      at org.apache.solr.core.SolrCore.access$000(SolrCore.java:84)
      at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1247)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:722)
 
  Jul 31, 2013 12:46:00 PM org.apache.solr.core.SolrCore registerSearcher
  INFO: [] Registered new searcher Searcher@5fa1891b main
  Jul 31, 2013 12:46:00 PM org.apache.solr.search.SolrIndexSearcher close
 
  Has anybody run into such an issue? This is really urgent and
  important. Waiting for your response.
 
  Thanks and Regards
  Vineet



Unexpected character '<' (code 60) expected '='

2013-07-31 Thread Vineet Mishra
Hi All

I am currently stuck on a Solr issue while posting some data to a Solr server.

I have some records from HBase which I am posting to Solr, but after posting
about 1 million data records, it suddenly stopped. The Catalina log trace
showed,

org.apache.solr.common.SolrException: Unexpected character '<' (code 60)
expected '='

I am not sure whether the issue is malformed data in the posting, because
when I post the specific XML file that I generated on its own, it goes
through fine.
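
Code 60 is the '<' character, and the parser met it inside a start tag where it expected the '=' of an attribute. One possible cause (an assumption, not something the log confirms) is a field value from HBase that contains raw markup characters and is written into the update XML unescaped. A minimal sketch of escaping values before building the XML; the class and method names here are hypothetical, not part of Solr:

```java
class XmlEscape {
    /** Escapes the five characters that are special in XML text and attribute values. */
    static String escapeXml(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds one <field> element of a Solr XML update message (hypothetical helper). */
    static String field(String name, String value) {
        return "<field name=\"" + escapeXml(name) + "\">" + escapeXml(value) + "</field>";
    }
}
```

If the values are already escaped, the other thing worth ruling out is several threads writing into the same file or stream, which can interleave tags and produce the same kind of error.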

Below is the whole log trace,


SEVERE: org.apache.solr.common.SolrException: Unexpected character '<' (code 60) expected '='
 at [row,col {unknown-source}]: [20281,18]
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:81)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1398)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:722)
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '<' (code 60) expected '='
 at [row,col {unknown-source}]: [20281,18]
    at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:648)
    at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:3001)
    at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2936)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
    at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:295)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:157)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
    ... 17 more

Has anybody faced this issue?

Thanks and Regards
Vineet


Re: Unexpected character '<' (code 60) expected '='

2013-07-31 Thread Vineet Mishra
I checked the file and found nothing wrong there; the formatting is correct,
and it's a valid XML file.
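
A way to double-check this mechanically, rather than by eye, is to run the exact payload that was posted through a streaming XML parser and print the position of the first error, which can then be compared with the row and column in the Solr log. This is only a sketch using the StAX API bundled with the JDK; the class and method names are made up for illustration:

```java
import java.io.Reader;
import javax.xml.stream.Location;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class XmlWellFormedCheck {
    /** Returns null if the input is well-formed, otherwise a "row, col: message" string. */
    static String firstError(Reader xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            // Walk every event; the parser throws at the first well-formedness error.
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return null;
        } catch (XMLStreamException e) {
            Location loc = e.getLocation();
            String where = (loc == null)
                    ? "unknown position"
                    : "row " + loc.getLineNumber() + ", col " + loc.getColumnNumber();
            return where + ": " + e.getMessage();
        }
    }
}
```

Calling firstError(new java.io.FileReader(path)) on the file that was actually sent at the time of the failure is the interesting case; a file regenerated afterwards may not contain the same bytes.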


On Wed, Jul 31, 2013 at 6:38 PM, Markus Jelsma
markus.jel...@openindex.io wrote:

 This file is malformed:

 SEVERE: org.apache.solr.common.SolrException: Unexpected character '<'
 (code 60) expected '='
  at [row,col {unknown-source}]: [20281,18]

 Check row 20281 column 18


 -Original message-
  From:Vineet Mishra clearmido...@gmail.com
  Sent: Wednesday 31st July 2013 15:05
  To: solr-user@lucene.apache.org
  Subject: Unexpected character '<' (code 60) expected '='
 
  Hi All
 
  I am currently stuck on a Solr issue while posting some data to a Solr
  server.
 
  I have some records from HBase which I am posting to Solr, but after
  posting about 1 million data records, it suddenly stopped. The Catalina
  log trace showed,
 
  org.apache.solr.common.SolrException: Unexpected character '<' (code 60)
  expected '='
 
  I am not sure whether the issue is malformed data in the posting, because
  when I post the specific XML file that I generated on its own, it goes
  through fine.
 
  Has anybody faced this issue?
 
  Thanks and Regards
  Vineet
 



Group and performing statistics on groups

2013-07-26 Thread Vineet Mishra
Hi

This is an urgent request. I am grouping Solr documents by a field and
want to get the range (min and max) of another field within each group.

StatsComponent works fine on the document set as a whole, returning the max
and min of a field. Is it possible to get StatsComponent results per group?
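
For reference, a client-side fallback is to compute the range per group after fetching the grouping field and the numeric field for each document. The sketch below assumes the rows have already been read from the response as (group, value) pairs; the class name and input shape are illustrative only and not a Solr API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupRange {
    /** Returns group -> {min, max} for rows given as (group, value) pairs. */
    static Map<String, double[]> minMaxByGroup(List<? extends Map.Entry<String, Double>> rows) {
        Map<String, double[]> ranges = new LinkedHashMap<>();
        for (Map.Entry<String, Double> row : rows) {
            double v = row.getValue();
            double[] range = ranges.get(row.getKey());
            if (range == null) {
                // First value seen for this group is both its min and its max.
                ranges.put(row.getKey(), new double[] {v, v});
            } else {
                range[0] = Math.min(range[0], v);
                range[1] = Math.max(range[1], v);
            }
        }
        return ranges;
    }
}
```

This only scales to result sets that fit comfortably in client memory. StatsComponent's stats.facet parameter, which breaks statistics down by the values of another field, may cover the same need on the server side and is worth checking in the documentation for the Solr version in use.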


Thanks and Regards
Vineet


Sorting the Solr Document after clubbing them from multiple instances

2013-07-24 Thread Vineet Mishra
Hi

I have a master Solr through which I query multiple Solr instances,
aggregate their responses, and return the result to the user.

The requirement now is that the data gathered from the multiple Solr
instances should be sorted on some field.

Say I have 3 slave Solrs (Solr 1, Solr 2, Solr 3), and each returns a
sorted response to the master's request handler:

Solr 1 - <str name="value">5</str>
         <str name="value">8</str>

Solr 2 - <str name="value">6</str>
         <str name="value">9</str>

Solr 3 - <str name="value">2</str>
         <str name="value">4</str>

The combined response is not sorted, though; in this case I get

<str name="value">5</str>
<str name="value">8</str>
<str name="value">6</str>
<str name="value">9</str>
<str name="value">2</str>
<str name="value">4</str>

but what I want is a merged, sorted response:

<str name="value">2</str>
<str name="value">4</str>
<str name="value">5</str>
<str name="value">6</str>
<str name="value">8</str>
<str name="value">9</str>

My current plan is to build a map keyed on the sort field, with the
document as the value, and sort based on that.

Is this a good way to go, or is there a better approach?
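
Since each instance already returns its results sorted, one alternative to re-sorting everything through a map is a k-way merge with a priority queue, which only ever compares the current head of each list. A minimal sketch under that assumption; the class is hypothetical and works on plain Java lists rather than Solr response objects:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

class SortedMerge {
    /** The next unmerged value of one list, plus the iterator it came from. */
    private static final class Head<T> {
        final T value;
        final Iterator<T> rest;
        Head(T value, Iterator<T> rest) { this.value = value; this.rest = rest; }
    }

    /** Merges lists that are each already sorted by 'order' into one sorted list. */
    static <T> List<T> merge(List<List<T>> sortedLists, Comparator<? super T> order) {
        PriorityQueue<Head<T>> heap =
                new PriorityQueue<>((a, b) -> order.compare(a.value, b.value));
        for (List<T> list : sortedLists) {
            Iterator<T> it = list.iterator();
            if (it.hasNext()) heap.add(new Head<>(it.next(), it));
        }
        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head<T> head = heap.poll();          // smallest remaining value overall
            merged.add(head.value);
            if (head.rest.hasNext()) heap.add(new Head<>(head.rest.next(), head.rest));
        }
        return merged;
    }
}
```

With the three example responses above, merging [5, 8], [6, 9] and [2, 4] in natural order gives 2, 4, 5, 6, 8, 9. It may also be worth checking whether Solr's own distributed search (the shards request parameter together with sort) fits the setup, since it merges sorted shard results itself.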

Thanks
Vineet


Custom RequestHandlerBase XML Response Issue

2013-07-18 Thread Vineet Mishra
Hi all

I am using a custom RequestHandlerBase in which I query multiple
Solr instances and aggregate their output into an XML Document
using DOM.
In the request handler's handleRequestBody(SolrQueryRequest req,
SolrQueryResponse resp) method I want to return this XML Document to the
user as the response, but if I add it as a Document or Node with

For Document
response.add(grouped, domResult);
or
For Node
response.add(grouped, domNode);

what gets written to the user is

For Document
com.sun.org.apache.xerces.internal.dom.DocumentImpl:[#document: null]
or
For Node
com.sun.org.apache.xerces.internal.dom.ElementImpl:[arr: null]


The Document is populated, because when I convert it to a String it
comes out perfectly, but I don't want a String; I want the response in
XML format.

This is very urgent; has anybody worked on this?

Regards
Vineet


Re: Custom RequestHandlerBase XML Response Issue

2013-07-18 Thread Vineet Mishra
Thanks for your response Shalin,
so does that mean that we can't return an XML object in SolrQueryResponse
through a custom RequestHandler?


On Thu, Jul 18, 2013 at 4:04 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 This isn't a Solr issue. Maybe ask on the xerces list?




 --
 Regards,
 Shalin Shekhar Mangar.



Re: Custom RequestHandlerBase XML Response Issue

2013-07-18 Thread Vineet Mishra
But it seems Solr even has something called XMLResponseWriter:

https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/response/XMLResponseWriter.java

Won't it be appropriate in my case?
Although I have not implemented it yet, how can there be no
way to produce a SolrQueryResponse in XML format?


On Thu, Jul 18, 2013 at 4:36 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 Solr's response writers support only a few known types. Look at the
 writeVal method in TextResponseWriter:


 https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/response/TextResponseWriter.java





 --
 Regards,
 Shalin Shekhar Mangar.



Re: Custom RequestHandlerBase XML Response Issue

2013-07-18 Thread Vineet Mishra
So does that mean there is no way that we can write an XML or JSON object to
the SolrQueryResponse and expect it to be formatted?


Re: Custom RequestHandlerBase XML Response Issue

2013-07-18 Thread Vineet Mishra
My case is this: I have a few Solr instances, which I query to get their
XML responses. From each response I have to extract a group of specific
XML nodes, and then I combine the nodes from all the instances into a
single XML and build a DOM document out of it.

So, as you mentioned in your last mail, how can I prepare a combined
response for this XML doc? And even if I do, I don't think it would work,
because I am doing the same thing in the RequestHandler.
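
For what it's worth, the conversion Shalin describes (quoted below) could be sketched as turning the DOM tree into plain Java collections before adding it to the response. The class below is hypothetical and uses only the JDK; it assumes child elements carry a name attribute as in Solr's XML output, and that the response writer in the Solr version at hand accepts Map, List and String values. The writeVal method Shalin points to is the place to confirm that; otherwise NamedList or SolrDocumentList would be the target types.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomToPlainTypes {
    /**
     * Converts an element into a String (leaf) or a Map from child key to the list
     * of converted children. The key is the child's "name" attribute when present,
     * else its tag name. Other attributes and mixed content are ignored in this sketch.
     */
    static Object convert(Element element) {
        Map<String, List<Object>> children = new LinkedHashMap<>();
        NodeList nodes = element.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) continue; // skip text/whitespace
            Element child = (Element) node;
            String key = child.hasAttribute("name") ? child.getAttribute("name") : child.getNodeName();
            children.computeIfAbsent(key, k -> new ArrayList<>()).add(convert(child));
        }
        if (children.isEmpty()) return element.getTextContent().trim();
        return children;
    }

    /** Parses an XML string and returns its root element (helper for trying the sketch out). */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not parseable as XML", e);
        }
    }
}
```

In the handler, the idea would then be to add the converted value instead of the DOM node, for example resp.add(grouped, DomToPlainTypes.convert(domNode)), so the writer sees types it knows how to serialize rather than falling back to toString().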





On Thu, Jul 18, 2013 at 6:30 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 Okay, let me explain. If you construct your combined response (why are you
 doing that again?) in the form of a Solr NamedList or SolrDocumentList then
 the XMLResponseWriter (which btw uses TextResponseWriter) has no problem
 writing it down as XML. The problem here is that you are giving it an
 object (a DOM Document?) which it doesn't know how to serialize so it just
 calls .toString on it and writes it out.

 As long as you stick a known type into the SolrQueryResponse, you should be
 fine.


 On Thu, Jul 18, 2013 at 6:24 PM, Vineet Mishra clearmido...@gmail.com
 wrote:

  So does that mean there is no way that we can write an XML or JSON object
  to the SolrQueryResponse and expect it to be formatted?
 



 --
 Regards,
 Shalin Shekhar Mangar.