Re: Concurrent Indexing and Searching in Solr.

2015-08-08 Thread Nitin Solanki
Thanks, Erick, for your suggestion. I will remove commit=true, use Solr
5.2, and then get back to you for further help. Thanks.

On Sat, Aug 8, 2015 at 4:07 AM Erick Erickson erickerick...@gmail.com
wrote:

 bq: So, how many concurrent threads, at minimum, should I run?

 I really can't answer that in the abstract, you'll simply have to
 test.

 I'd prefer SolrJ to post.jar. If you're not going to use SolrJ, I'd
 imagine that moving from Python to post.jar isn't all that useful.

 But before you do anything, see what really happens when you remove the
 commit=true. That's likely far more important than the rest.

 Best,
 Erick

 On Fri, Aug 7, 2015 at 3:15 PM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi Erick,
  posting files to Solr via curl =
  Rather than posting files via curl, which is better: SolrJ or post.jar? I
  don't use either. I wrote a Python script for indexing, using urllib and
  urllib2 to index data via HTTP. I don't have any option to use SolrJ right
  now. How can I do the same thing via post.jar from Python? Any help,
  please.
 
  indexing with 100 threads is going to eat up a lot of CPU cycles
  = So, how many concurrent threads, at minimum, should I run? And I also
  need concurrent searching. So, how many?
 
  And thanks for the Solr 5.2 pointer; I will go through that. Thanks for
  the reply. Please help me.
 
  On Fri, Aug 7, 2015 at 11:51 PM Erick Erickson erickerick...@gmail.com
  wrote:
 
  bq: What limitations does Solr have for indexing and searching
  simultaneously? That is, how many simultaneous calls can I make for
  searching and indexing at once?
 
  None a priori. It all depends on the hardware you're throwing at it.
  Obviously
  indexing with 100 threads is going to eat up a lot of CPU cycles that
  can't then
  be devoted to satisfying queries. You need to strike a balance. Do
  seriously
  consider using some other method than posting files to Solr via curl
  or the like,
  that's rarely a robust solution for production.
 
  As for adding the commit=true, this shouldn't be affecting the index
  size; I suspect you were misled by something else happening.
 
  Really, remove it or you'll beat up your system hugely. As for the soft
  commit interval, that's totally irrelevant when you're committing every
  document. But do lengthen it as much as you can. Most of the time when
  people say real time, it turns out that 10 seconds is OK. Or 60 seconds
  is OK. You have to check what the _real_ requirement is; it's often not
  what's stated.
 
  bq: I am using Solr 5.0 version. Is 5.0 almost similar to 5.2 regarding
  indexing and searching data?
 
  Did you read the link I provided? With replicas, 5.2 will index almost
  twice as
  fast. That means (roughly) half the work on the followers is being done,
  freeing up cycles for performing queries.
 
  Best,
  Erick
 
 
  On Fri, Aug 7, 2015 at 2:06 PM, Nitin Solanki nitinml...@gmail.com
  wrote:
   Hi Erick,
You said that soft commit should be more than 3000 ms.
   Actually, I need real-time searching, and that's why I need a fast soft
   commit.
  
   commit=true = I set commit=true because it reduces my indexed data size
   from 1.5GB to 500MB on *each shard*. With commit=false, my indexed data
   size was 1.5GB; after changing to commit=true, the size dropped to
   500MB. I don't understand why that is.
  
    I am using Solr 5.0 version. Is 5.0 almost similar to 5.2 regarding
    indexing and searching data?
  
    What limitations does Solr have for indexing and searching
    simultaneously? That is, how many simultaneous calls can I make for
    searching and indexing at once?
  
  
   On Fri, Aug 7, 2015 at 9:18 PM Erick Erickson 
 erickerick...@gmail.com
   wrote:
  
   Your soft commit time of 3 seconds is quite aggressive,
   I'd lengthen it to as long as possible.
  
    Ugh, looked at your query more closely. Adding commit=true to every
    update request is horrible performance-wise. Letting your autocommit
    process handle the commits is the first thing I'd do. Second, I'd try
    going to SolrJ and batching up documents (I usually start with 1,000),
    or using the post.jar tool rather than sending them via a raw URL.
  
   I agree with Upayavira, 100 concurrent threads is a _lot_. Also, what
   version of Solr?
   There was a 2x speedup in Solr 5.2, see:
  
 
 http://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/
  
    One symptom was that the followers were doing way more work than the
    leader (BTW, using master/slave when talking SolrCloud is a bit
    confusing...) which will affect query response rates.
  
    Basically, if query response is paramount, you really need to throttle
    your indexing; there's just a whole lot of work going on here.
  
   Best,
   Erick
  
   On Fri, Aug 7, 2015 at 11:23 AM, Upayavira u...@odoko.co.uk wrote:
How many CPUs do you have? 100 concurrent 
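
Erick's advice in this thread (drop commit=true from update requests and lengthen the soft-commit interval) corresponds roughly to a solrconfig.xml setup like the following; the intervals are illustrative, not prescriptive:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit for durability; does not open a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit controls search visibility; 10s instead of the 3s discussed above -->
  <autoSoftCommit>
    <maxTime>10000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With this in place, clients simply stop passing commit=true and let the server decide when to commit.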

Re: Concurrent Indexing and Searching in Solr.

2015-08-08 Thread Upayavira
If you are using Python, then you can use urllib2, or requests (which
is reportedly better), or better still something like pysolr, which makes
life simpler.

Here's a Pull Request that makes pysolr Zookeeper aware, which'll help
if you are using SolrCloud. I hope one day they will merge it:

https://github.com/toastdriven/pysolr/pull/138

Upayavira
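
A minimal sketch of the batching Erick recommends, wired up for pysolr as suggested above; the core name, batch size, and surrounding variables are assumptions, not anything from this thread:

```python
def chunked(docs, size=1000):
    """Yield successive batches of `size` docs (Erick suggests starting around 1,000)."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

# Hypothetical usage against a local core named "mycore":
# import pysolr
# solr = pysolr.Solr("http://localhost:8983/solr/mycore", timeout=10)
# for batch in chunked(all_docs):
#     solr.add(batch, commit=False)  # no commit=true per request; let autoCommit handle it
```

Batching keeps the request count low, and commit=False leaves commit policy to the server-side autoCommit/autoSoftCommit settings.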


Re: SolrJ update

2015-08-08 Thread Andrea Gazzarini
Hi Henrique,
I don't believe there's an easy way to do that.

As you noticed, the SolrInputDocument is not an I/O param; that is, it is
not sent back once the data has been indexed. This is good: here you're
sending just one document, but imagine what could happen if you did a bulk
load... the response would be very, very large!

Although I could imagine some workarounds (with a custom
UpdateRequestProcessor and a custom ResponseWriter), the point is that (see
above) I believe it would end in a bad design:

   - sending one document at a time is *often* considered a bad practice;
   - if you send a lot of data, the corresponding response would be huge; it
   would contain a lot of newly created identifiers... and BTW, how do you
   match them with your input documents? Sequentially? In that case you
   won't be able to use any *async* client.

Personally, if it is OK for your context, I'd avoid the problem entirely
by moving the logic to the client side: generate a UUID in SolrJ and add
that ID to the outgoing document.
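
A sketch of that client-side approach in Python (the helper name is made up; the same idea applies to SolrJ with java.util.UUID): generate the uniqueKey before sending, so no response round-trip is needed.

```python
import uuid

def with_client_id(doc):
    """Assign the uniqueKey client-side so the ID is known before indexing."""
    out = dict(doc)
    out.setdefault("id", str(uuid.uuid4()))
    return out

doc = with_client_id({"name": "example"})
# doc["id"] is already known here; index it with any client, e.g.:
# solr.add([doc])
```

If the UUIDUpdateProcessorFactory stays configured, it only fills in the id when a document arrives without one, so a client-generated id is compatible with that chain.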

Best,
Andrea



2015-08-06 21:39 GMT+02:00 Henrique O. Santos hensan...@gmail.com:

 Hello all,

 I am using SolrJ to do an index update on one of my collections. This
 collection has a uniqueKey id field:

  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="_version_" type="long" indexed="true" stored="true"/>
    <field name="name" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>

 This field is configured to be auto-generated in solrconfig.xml like this:

  <updateRequestProcessorChain>
    <processor class="solr.UUIDUpdateProcessorFactory">
      <str name="fieldName">id</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

 On my Java code, I just add the name field to my document and then proceed
 with the add:

  doc.addField("name", this.name);
  solrClient.add(doc);
  solrClient.commit();

 Everything works, and the document gets indexed. What I really need is to
 know, right away in the code, the id that was generated for that single
 document. I have tried looking into the UpdateResponse but no luck.

 Is there any easy way to do that?

 Thank you in advance,
 Henrique.



Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-08 Thread Dmitry Kan
Shalin,

Thanks, can I also introduce custom entity tags like in my example with the
highlighter output?

Dmitry

On Fri, Aug 7, 2015 at 5:10 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 The thing is that you are trying to introduce custom xml tags which
 require changing the response writers. Instead, if you just used
 nested maps/lists or SimpleOrderedMap/NamedList then every response
 writer should be able to just directly write the output. Nesting is
 not a problem.

 On Fri, Aug 7, 2015 at 6:09 PM, Dmitry Kan solrexp...@gmail.com wrote:
  Shawn:
 
   thanks, we found an intermediate solution: serializing our data
   structure using a string representation, perhaps less optimal than
   using the binary format directly.
 
   In the original route with JavaBinCodec, we found that
   BinaryResponseWriter should also be extended. But the following method
   is static and does not allow extending:
 
   public static NamedList<Object> getParsedResponse(SolrQueryRequest req,
       SolrQueryResponse rsp) {
     try {
       Resolver resolver = new Resolver(req, rsp.getReturnFields());

       ByteArrayOutputStream out = new ByteArrayOutputStream();
       new JavaBinCodec(resolver).marshal(rsp.getValues(), out);

       InputStream in = new ByteArrayInputStream(out.toByteArray());
       return (NamedList<Object>) new JavaBinCodec(resolver).unmarshal(in);
     } catch (Exception ex) {
       throw new RuntimeException(ex);
     }
   }
 
 
 
  Shalin:
 
   We needed a new data structure in the highlighter with more nested
   levels than just one. Something like this (in XML representation):
 
   <lst name="highlighting">
     <lst name="doc1">
       <arr name="snippets">
         <snippet>
           <id>id1</id>
           <contents>Snippet text goes here</contents>
           <!-- other params -->
         </snippet>
       </arr>
     </lst>
   </lst>
 
  Can this be modelled with existing types?
 
 
  On Thu, Aug 6, 2015 at 9:47 PM, Shalin Shekhar Mangar 
  shalinman...@gmail.com wrote:
 
  What do you mean by a custom format? As long as your custom component
  is writing primitives or NamedList/SimpleOrderedMap or collections
  such as List/Map, any response writer should be able to handle them.
 
  On Wed, Aug 5, 2015 at 5:08 PM, Dmitry Kan solrexp...@gmail.com
 wrote:
   Hello,
  
   Solr: 5.2.1
   class: org.apache.solr.common.util.JavaBinCodec
  
    I'm working on a custom data structure for the highlighter. The data
    structure is ready in JSON and XML formats. I also need the JavaBin
    format. The data structure is already made serializable by extending
    the WritableValue class (methods write and resolve).
  
    To receive the custom format on the client via the solrj api, the data
    structure needs to be parseable by JavaBinCodec. Is this a correct
    assumption? Can we introduce the custom data structure consumer on the
    solrj api without a complete overhaul of the api? Is there a plugin
    framework such that JavaBinCodec is extended and used for the new data
    structure?
  
  
  
   --
   Dmitry Kan
   Luke Toolbox: http://github.com/DmitryKey/luke
   Blog: http://dmitrykan.blogspot.com
   Twitter: http://twitter.com/dmitrykan
   SemanticAnalyzer: www.semanticanalyzer.info
 
 
 
  --
  Regards,
  Shalin Shekhar Mangar.
 
 
 
 
  --
  Dmitry Kan
  Luke Toolbox: http://github.com/DmitryKey/luke
  Blog: http://dmitrykan.blogspot.com
  Twitter: http://twitter.com/dmitrykan
  SemanticAnalyzer: www.semanticanalyzer.info



 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info
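
Shalin's suggestion (plain nested maps/lists rather than custom XML elements) would model the snippet structure above roughly like this; sketched here as Python dicts/lists standing in for SimpleOrderedMap/NamedList:

```python
# Only generic containers, so any response writer (XML, JSON, javabin)
# can serialize the structure without modification.
highlighting = {
    "doc1": {
        "snippets": [
            {"id": "id1", "contents": "Snippet text goes here"},
        ],
    },
}

first = highlighting["doc1"]["snippets"][0]
```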


Re: Streaming API running a simple query

2015-08-08 Thread Selvam
Hi,

Thanks, good to know. In fact, my requirement needs to merge multiple
expressions, while current streaming expressions support only two. Do you
think we can expect that in future versions?
On 07-Aug-2015 6:46 pm, Joel Bernstein joels...@gmail.com wrote:

 Hi,

 There is a new error handling framework in trunk (SOLR-7441) for the
 Streaming API, Streaming Expressions.

  So if you're purely in testing mode, it will be much easier to work in
  trunk than in Solr 5.2.

  If you run into errors in trunk that are still confusing, please continue
  to report them so we can get all the error messages covered.

 Thanks,

 Joel


 Joel Bernstein
 http://joelsolr.blogspot.com/

 On Fri, Aug 7, 2015 at 6:19 AM, Selvam s.selvams...@gmail.com wrote:

  Hi,
 
  Sorry, it is working now.
 
  curl --data-urlencode
  'stream=search(gettingstarted,q=*:*,fl=id,sort=id asc)'
  http://localhost:8983/solr/gettingstarted/stream
 
  I missed *'asc'* in sort :)
 
  Thanks for the help Shawn Heisey.
 
  On Fri, Aug 7, 2015 at 3:46 PM, Selvam s.selvams...@gmail.com wrote:
 
   Hi,
  
    Thanks for your update. Yes, I was missing the cloud mode; I am new to
    the world of SolrCloud. Now I have enabled a single node (with two
    shards and replicas) that runs on port 8983, along with ZooKeeper
    running on port 9983.
   When I run,
  
curl --data-urlencode
   'stream=search(gettingstarted,q=*:*,fl=id,sort=id)'
   http://localhost:8983/solr/gettingstarted/stream
  
   Again, I get
  
   Unable to construct instance of
   org.apache.solr.client.solrj.io.stream.CloudSolrStream
   .
   .
  
   Caused by: java.lang.reflect.InvocationTargetException
   .
   .
   Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
  
    I tried a different port, 9983, as well, which returns "Empty reply
    from server". I think I'm missing some obvious configuration.
  
  
  
  
   On Fri, Aug 7, 2015 at 2:04 PM, Shawn Heisey apa...@elyograg.org
  wrote:
  
   On 8/7/2015 1:37 AM, Selvam wrote:
   
  https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions
   
I tried this from my linux terminal,
1)   curl --data-urlencode
'stream=search(gettingstarted,q=*:*,fl=id,sort=id)'
http://localhost:8983/solr/gettingstarted/stream
   
Threw zkHost error. Then tried with,
   
2)   curl --data-urlencode
   
  
 
 'stream=search(gettingstarted,zkHost=localhost:8983,q=*:*,fl=id,sort=id)'
http://localhost:8983/solr/gettingstarted/stream
   
It throws me java.lang.ArrayIndexOutOfBoundsException: 1\n\tat
   
  
 
 org.apache.solr.client.solrj.io.stream.CloudSolrStream.parseComp(CloudSolrStream.java:260)
  
   The documentation page you linked seems to indicate that this is a
   feature that only works in SolrCloud.  Your inclusion of
   localhost:8983 as the zkHost suggests that either you are NOT
 running
   in cloud mode, or that you do not understand what zkHost means.
  
   Zookeeper runs on a different port than Solr.  8983 is Solr's port.
 If
   you are running a 5.x cloud with the embedded zookeeper, it is most
   likely running on port 9983.  If you are running in cloud mode with a
   properly configured external zookeeper, then your zkHost parameter
 will
   probably have three hosts in it with port 2181.
  
   Thanks,
   Shawn
  
  
  
  
   --
   Regards,
   Selvam
   KnackForge http://knackforge.com
   Acquia Service Partner
   No. 1, 12th Line, K.K. Road, Venkatapuram,
   Ambattur, Chennai,
   Tamil Nadu, India.
   PIN - 600 053.
  
 
 
 
  --
  Regards,
  Selvam
  KnackForge http://knackforge.com
  Acquia Service Partner
  No. 1, 12th Line, K.K. Road, Venkatapuram,
  Ambattur, Chennai,
  Tamil Nadu, India.
  PIN - 600 053.
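
The fix above (the sort clause was missing its direction) can be guarded client-side; a small hypothetical helper that builds the expression and validates the sort parameter:

```python
def stream_search_expr(collection, q="*:*", fl="id", sort="id asc"):
    """Build a Streaming Expression search(); the sort clause must include a
    direction (asc/desc), and omitting it triggered the
    ArrayIndexOutOfBoundsException seen in this thread."""
    if sort.split()[-1].lower() not in ("asc", "desc"):
        raise ValueError("sort must end with 'asc' or 'desc': %r" % sort)
    return "search(%s,q=%s,fl=%s,sort=%s)" % (collection, q, fl, sort)

# expr = stream_search_expr("gettingstarted")
# POST it as stream=<expr> to http://localhost:8983/solr/gettingstarted/stream
```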
 



Re: Streaming API running a simple query

2015-08-08 Thread Joel Bernstein
Can you describe your use case?

Joel Bernstein
http://joelsolr.blogspot.com/



Re: Streaming API running a simple query

2015-08-08 Thread Selvam
Hi,

I need to run multiple subqueries, each with its own limit of rows.

For example: get 30 users from India with age greater than 30, and 50
users from England who are all male.

Thanks again.
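
Until expressions can merge more than two streams, one hedged workaround for a case like this is to run the subqueries separately and concatenate the results client-side, each truncated to its own row limit; a sketch with placeholder data:

```python
def merge_limited(subqueries):
    """Concatenate results of several subqueries, truncating each to its own
    row limit (mimicking multiple searches, each with its own rows count)."""
    merged = []
    for docs, limit in subqueries:
        merged.extend(docs[:limit])
    return merged

# Placeholder result sets standing in for the two real queries
# (e.g. fq=country:India AND age:[31 TO *], fq=country:England AND gender:male):
docs_india = [{"id": i} for i in range(100)]
docs_england = [{"id": 100 + i} for i in range(80)]
users = merge_limited([(docs_india, 30), (docs_england, 50)])  # 30 + 50 docs
```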



Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-08 Thread Shalin Shekhar Mangar
No, I'm afraid you will have to extend the XmlResponseWriter in that case.



-- 
Regards,
Shalin Shekhar Mangar.


Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-08 Thread Shalin Shekhar Mangar
Or use the XsltResponseWriter :)

On Sat, Aug 8, 2015 at 7:51 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
 No, I'm afraid you will have to extend the XmlResponseWriter in that case.

 On Sat, Aug 8, 2015 at 2:02 PM, Dmitry Kan solrexp...@gmail.com wrote:
 Shalin,

 Thanks, can I also introduce custom entity tags like in my example with the
 highlighter output?

 Dmitry

 On Fri, Aug 7, 2015 at 5:10 PM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:

 The thing is that you are trying to introduce custom xml tags which
 require changing the response writers. Instead, if you just used
 nested maps/lists or SimpleOrderedMap/NamedList then every response
 writer should be able to just directly write the output. Nesting is
 not a problem.

 On Fri, Aug 7, 2015 at 6:09 PM, Dmitry Kan solrexp...@gmail.com wrote:
  Shawn:
 
  thanks, we found an intermediate solution by serializing our data
 structure
  using string representation, perhaps less optimal than using binary
 format
  directly.
 
  In the original approach with JavaBinCodec we found that
  BinaryResponseWriter should also be extended. But the following method is
  static and does not allow extending:
 
  public static NamedList<Object> getParsedResponse(SolrQueryRequest req, SolrQueryResponse rsp) {
    try {
      Resolver resolver = new Resolver(req, rsp.getReturnFields());
 
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      new JavaBinCodec(resolver).marshal(rsp.getValues(), out);
 
      InputStream in = new ByteArrayInputStream(out.toByteArray());
      return (NamedList<Object>) new JavaBinCodec(resolver).unmarshal(in);
    }
    catch (Exception ex) {
      throw new RuntimeException(ex);
    }
  }
 
 
 
  Shalin:
 
  We needed new data structure in highlighter with more nested levels,
  than just one. Something like this (in xml representation):
 
  <lst name="highlighting">
    <lst name="doc1">
      <arr name="snippets">
        <snippet>
          <id>id1</id>
          <contents>Snippet text goes here</contents>
          <other params/>
        </snippet>
      </arr>
    </lst>
  </lst>
 
  Can this be modelled with existing types?
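For what it's worth, the nested structure above maps directly onto ordered maps and lists. A minimal, self-contained sketch using plain java.util types (in an actual Solr component you would build the same shape with SimpleOrderedMap/NamedList, which every response writer can serialize; names like "doc1" and "id1" come from the example, the rest is hypothetical):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HighlightShape {
    // Build the nested highlighting structure from the XML example:
    // highlighting -> doc1 -> snippets[] -> {id, contents}
    public static Map<String, Object> build() {
        Map<String, Object> snippet = new LinkedHashMap<>();
        snippet.put("id", "id1");
        snippet.put("contents", "Snippet text goes here");

        List<Object> snippets = new ArrayList<>();  // <arr name="snippets">
        snippets.add(snippet);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("snippets", snippets);

        Map<String, Object> highlighting = new LinkedHashMap<>();
        highlighting.put("doc1", doc);
        return highlighting;
    }

    public static void main(String[] args) {
        // Prints: {doc1={snippets=[{id=id1, contents=Snippet text goes here}]}}
        System.out.println(build());
    }
}
```

The point is that arbitrary nesting of maps and lists needs no response-writer changes, whereas custom entity tags like `<snippet>` do.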
 
 
  On Thu, Aug 6, 2015 at 9:47 PM, Shalin Shekhar Mangar 
  shalinman...@gmail.com wrote:
 
  What do you mean by a custom format? As long as your custom component
  is writing primitives or NamedList/SimpleOrderedMap or collections
  such as List/Map, any response writer should be able to handle them.
 
  On Wed, Aug 5, 2015 at 5:08 PM, Dmitry Kan solrexp...@gmail.com
 wrote:
   Hello,
  
   Solr: 5.2.1
   class: org.apache.solr.common.util.JavaBinCodec
  
   I'm working on a custom data structure for the highlighter. The data
   structure is ready in JSON and XML formats. I need also JavaBin
 format.
  The
   data structure is already made serializable by extending the
  WritableValue
   class (methods write and resolve).
  
   To receive the custom format on the client via solrj api, the data
   structure needs to be parseable by JavaBinCodec. Is this correct
   assumption? Can we introduce the custom data structure consumer on the
   solrj api without complete overhaul of the api? Is there plugin
 framework
   such that JavaBinCodec is extended and used for the new data
 structure?
  
  
  
   --
   Dmitry Kan
   Luke Toolbox: http://github.com/DmitryKey/luke
   Blog: http://dmitrykan.blogspot.com
   Twitter: http://twitter.com/dmitrykan
   SemanticAnalyzer: www.semanticanalyzer.info
 
 
 
  --
  Regards,
  Shalin Shekhar Mangar.
 
 
 
 
  --
  Dmitry Kan
  Luke Toolbox: http://github.com/DmitryKey/luke
  Blog: http://dmitrykan.blogspot.com
  Twitter: http://twitter.com/dmitrykan
  SemanticAnalyzer: www.semanticanalyzer.info



 --
 Regards,
 Shalin Shekhar Mangar.




 --
 Dmitry Kan
 Luke Toolbox: http://github.com/DmitryKey/luke
 Blog: http://dmitrykan.blogspot.com
 Twitter: http://twitter.com/dmitrykan
 SemanticAnalyzer: www.semanticanalyzer.info



 --
 Regards,
 Shalin Shekhar Mangar.



-- 
Regards,
Shalin Shekhar Mangar.


SolrCloud - Error getting leader from zk

2015-08-08 Thread Francisco Andrade
Hello there,

I'm getting these errors after an election:
ERROR - 2015-08-08 13:51:05.035; org.apache.solr.cloud.ZkController; Error
getting leader from zk
org.apache.solr.common.SolrException: There is conflicting information
about the leader of shard: shard1 our state
says:http://HOST/solr/COLLECTION/
but zookeeper says:http://ANOTHER_HOST/solr/COLLECTION
at
org.apache.solr.cloud.ZkController.getLeader(ZkController.java:889)
...
INFO  - 2015-08-08 13:51:05.036; org.apache.solr.cloud.ZkController;
publishing core=COLLECTION state=down collection=COLLECTION

Then the host gets registered as down.
I've tried cleaning the data path and restarting the node, but it didn't work.

Yesterday I had an issue where I needed to update the collection leader in
clusterstate.json by editing the file in Exhibitor.
If I access the console through other nodes, the leader shown is the right
one.
But it seems to me that the updated clusterstate.json was not sent properly
to this specific node.

Any suggestions on how to fix it?

Att.,
Francisco Andrade


Re: SolrCloud - Error getting leader from zk

2015-08-08 Thread Shawn Heisey
On 8/8/2015 8:38 AM, Francisco Andrade wrote:
 Yesterday I had an issue where I needed to update the collection leader on
 the clusterstate.json editing the file in exhibitor.
 If i access the console through other nodes the leader showing is the right
 one.
 But it seems to me that the updated clusterstate.json was not sent properly
 to this specific node.
 
 Any suggestions on how to fix it?

First, what version of Solr are you running?  Do you have the same
version on all nodes?

I am not familiar enough with the code to try to debug what the log
messages mean, so I will just speak in general terms.

Unless it's the only way to fix a situation that has arisen because of
extraordinary circumstances, it's a bad idea to manually edit what's in
the zookeeper database, and even then, it's a good idea to restart Solr
to be sure it notices the change.

On the topic of your specific edit, if you want to be able to control
which node is the leader, upgrade to the latest Solr version and use the
new preferred leader capability.  I believe it was added in 5.0.

https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-RebalanceLeaders

Something I would try at this point is shutting down and restarting each
Solr instance in your cluster, to be absolutely sure their internal
state agrees with zookeeper.  Restarting Solr instances will shuffle
your leaders around, and you may not end up with the leader assignments
you want.  The preferred leader feature is a much better way to handle this.

Thanks,
Shawn



Re: Streaming API running a simple query

2015-08-08 Thread Joel Bernstein
This sounds doable using nested merge functions like this:

merge(search(...),
   merge(search(...), search(),...), ...)

Joel Bernstein
http://joelsolr.blogspot.com/

On Sat, Aug 8, 2015 at 8:08 AM, Selvam s.selvams...@gmail.com wrote:

 Hi,

 I need to run multiple subqueries, each with its own limit of rows.

 For example: to get 30 users from India with age greater than 30, and 50
 users from England who are all male.

 Thanks again.
 On 08-Aug-2015 5:30 pm, Joel Bernstein joels...@gmail.com wrote:

  Can you describe your use case?
 
  Joel Bernstein
  http://joelsolr.blogspot.com/
 
  On Sat, Aug 8, 2015 at 7:36 AM, Selvam s.selvams...@gmail.com wrote:
 
   Hi,
  
    Thanks, good to know. In fact my requirement needs merging multiple
    expressions, while the current merge expression supports only two. Do
    you think we can expect that in future versions?
   On 07-Aug-2015 6:46 pm, Joel Bernstein joels...@gmail.com wrote:
  
Hi,
   
There is a new error handling framework in trunk (SOLR-7441) for the
Streaming API, Streaming Expressions.
   
So if you're purely in testing mode, it will be much easier to work
 in
trunk then Solr 5.2.
   
If you run into errors in trunk that are still confusing please
  continue
   to
report them so we can get all the error messages covered.
   
Thanks,
   
Joel
   
   
Joel Bernstein
http://joelsolr.blogspot.com/
   
On Fri, Aug 7, 2015 at 6:19 AM, Selvam s.selvams...@gmail.com
 wrote:
   
 Hi,

 Sorry, it is working now.

 curl --data-urlencode
 'stream=search(gettingstarted,q=*:*,fl=id,sort=id asc)'
 http://localhost:8983/solr/gettingstarted/stream

 I missed *'asc'* in sort :)

 Thanks for the help Shawn Heisey.

 On Fri, Aug 7, 2015 at 3:46 PM, Selvam s.selvams...@gmail.com
  wrote:

  Hi,
 
  Thanks for your update, yes, I was missing the cloud mode, I am
 new
   to
 the
  world of Solr cloud. Now I have enabled a single node (with two
   shards

  replicas) that runs on 8983 port along with zookeeper running on
  9983
 port.
  When I run,
 
   curl --data-urlencode
  'stream=search(gettingstarted,q=*:*,fl=id,sort=id)'
  http://localhost:8983/solr/gettingstarted/stream
 
  Again, I get
 
  Unable to construct instance of
  org.apache.solr.client.solrj.io.stream.CloudSolrStream
  .
  .
 
  Caused by: java.lang.reflect.InvocationTargetException
  .
  .
  Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
 
  I tried different port, 9983 as well, which returns Empty reply
  from
  server. I think I miss some obvious configuration.
 
 
 
 
  On Fri, Aug 7, 2015 at 2:04 PM, Shawn Heisey 
 apa...@elyograg.org
 wrote:
 
  On 8/7/2015 1:37 AM, Selvam wrote:
  

  https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions
  
   I tried this from my linux terminal,
   1)   curl --data-urlencode
   'stream=search(gettingstarted,q=*:*,fl=id,sort=id)'
   http://localhost:8983/solr/gettingstarted/stream
  
   Threw zkHost error. Then tried with,
  
    2)   curl --data-urlencode
    'stream=search(gettingstarted,zkHost=localhost:8983,q=*:*,fl=id,sort=id)'
    http://localhost:8983/solr/gettingstarted/stream
   
    It throws me java.lang.ArrayIndexOutOfBoundsException: 1\n\tat
  org.apache.solr.client.solrj.io.stream.CloudSolrStream.parseComp(CloudSolrStream.java:260)
 
  The documentation page you linked seems to indicate that this
 is a
  feature that only works in SolrCloud.  Your inclusion of
  localhost:8983 as the zkHost suggests that either you are NOT
running
  in cloud mode, or that you do not understand what zkHost means.
 
  Zookeeper runs on a different port than Solr.  8983 is Solr's
  port.
If
  you are running a 5.x cloud with the embedded zookeeper, it is
  most
  likely running on port 9983.  If you are running in cloud mode
  with
   a
  properly configured external zookeeper, then your zkHost
 parameter
will
  probably have three hosts in it with port 2181.
 
  Thanks,
  Shawn
 
 
 
 
  --
  Regards,
  Selvam
  KnackForge http://knackforge.com
  Acquia Service Partner
  No. 1, 12th Line, K.K. Road, Venkatapuram,
  Ambattur, Chennai,
  Tamil Nadu, India.
  PIN - 600 053.
 



 --
 Regards,
 Selvam
 KnackForge http://knackforge.com
 Acquia Service Partner
 No. 1, 12th Line, K.K. Road, Venkatapuram,
 Ambattur, Chennai,
 Tamil Nadu, India.
 PIN - 600 053.

   
  
 



Re: SolrCloud - Error getting leader from zk

2015-08-08 Thread Francisco Andrade
Hi Shawn, thanks for replying.
My solr version is 4.9.0 at all nodes.

I just figured out what the problem was.
When I edited clusterstate.json in Exhibitor, I forgot to also edit the
znode located at /collections/<collection_name>/leaders/<shard>.
There is also a leader config in that znode.

Once I updated the leader in that znode to match the one chosen in
clusterstate.json and restarted the Solr nodes that were facing the problem,
everything worked fine.

Att.,
Francisco Andrade

On Sat, Aug 8, 2015 at 12:04 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 8/8/2015 8:38 AM, Francisco Andrade wrote:
  Yesterday I had an issue where I needed to update the collection leader
 on
  the clusterstate.json editing the file in exhibitor.
  If i access the console through other nodes the leader showing is the
 right
  one.
  But it seems to me that the updated clusterstate.json was not sent
 properly
  to this specific node.
 
  Any suggestions on how to fix it?

 First, what version of Solr are you running?  Do you have the same
 version on all nodes?

 I am not familiar enough with the code to try and debug what the log
 messages mean, I will just speak in general terms.

 Unless it's the only way to fix a situation that has arisen because of
 extraordinary circumstances, it's a bad idea to manually edit what's in
 the zookeeper database, and even then, it's a good idea to restart Solr
 to be sure it notices the change.

 On the topic of your specific edit, if you want to be able to control
 which node is the leader, upgrade to the latest Solr version and use the
 new preferred leader capability.  I believe it was added in 5.0.


 https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-RebalanceLeaders

 Something I would try at this point is shutting down and restarting each
 Solr instance in your cluster, to be absolutely sure their internal
 state agrees with zookeeper.  Restarting Solr instances will shuffle
 your leaders around, and you may not end up with the leader assignments
 you want.  The preferred leader feature is a much better way to handle
 this.

 Thanks,
 Shawn




Re: docValues

2015-08-08 Thread Nagasharath
I am seeing a significant difference in query time after enabling docValues.

I am curious to know what's happening with 'docValues' included in the schema.

 On 07-Aug-2015, at 4:31 pm, Shawn Heisey apa...@elyograg.org wrote:
 
 On 8/7/2015 11:47 AM, naga sharathrayapati wrote:
 JVM-Memory has gone up from 3% to 17.1%
 
 In my experience, a healthy Java application (after the heap size has
 stabilized) will have a heap utilization graph where the low points are
 between 50 and 75 percent.  If the low points in heap utilization are
 consistently below 25 percent, you would be better off reducing the heap
 size and allowing the OS to use that memory instead.
 
 If you want to track heap utilization, JVM-Memory in the Solr dashboard
 is a very poor tool.  Use tools like visualvm or jconsole.
 
 https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
 
 I need to add what I said about very low heap utilization to that wiki page.
 
 Thanks,
 Shawn
 


Re: docValues

2015-08-08 Thread Aman Tandon
Hi,


 I am seeing a significant difference in the query time after using docValue

what kind of difference, is it good or bad?

With Regards
Aman Tandon

On Sat, Aug 8, 2015 at 11:38 PM, Nagasharath sharathrayap...@gmail.com
wrote:

 I am seeing a significant difference in the query time after using
 docValue.

 I am curious to know what's happening with 'docValue' included in the
 schema

  On 07-Aug-2015, at 4:31 pm, Shawn Heisey apa...@elyograg.org wrote:
 
  On 8/7/2015 11:47 AM, naga sharathrayapati wrote:
  JVM-Memory has gone up from 3% to 17.1%
 
  In my experience, a healthy Java application (after the heap size has
  stabilized) will have a heap utilization graph where the low points are
  between 50 and 75 percent.  If the low points in heap utilization are
  consistently below 25 percent, you would be better off reducing the heap
  size and allowing the OS to use that memory instead.
 
  If you want to track heap utilization, JVM-Memory in the Solr dashboard
  is a very poor tool.  Use tools like visualvm or jconsole.
 
  https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
 
  I need to add what I said about very low heap utilization to that wiki
 page.
 
  Thanks,
  Shawn
 



Re: docValues

2015-08-08 Thread Nagasharath
Good

Sent from my iPhone

 On 08-Aug-2015, at 8:12 pm, Aman Tandon amantandon...@gmail.com wrote:
 
 Hi,
 
 
 I am seeing a significant difference in the query time after using docValue
 
 what kind of difference, is it good or bad?
 
 With Regards
 Aman Tandon
 
 On Sat, Aug 8, 2015 at 11:38 PM, Nagasharath sharathrayap...@gmail.com
 wrote:
 
 I am seeing a significant difference in the query time after using
 docValue.
 
 I am curious to know what's happening with 'docValue' included in the
 schema
 
 On 07-Aug-2015, at 4:31 pm, Shawn Heisey apa...@elyograg.org wrote:
 
 On 8/7/2015 11:47 AM, naga sharathrayapati wrote:
 JVM-Memory has gone up from 3% to 17.1%
 
 In my experience, a healthy Java application (after the heap size has
 stabilized) will have a heap utilization graph where the low points are
 between 50 and 75 percent.  If the low points in heap utilization are
 consistently below 25 percent, you would be better off reducing the heap
 size and allowing the OS to use that memory instead.
 
 If you want to track heap utilization, JVM-Memory in the Solr dashboard
 is a very poor tool.  Use tools like visualvm or jconsole.
 
 https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
 
 I need to add what I said about very low heap utilization to that wiki
 page.
 
 Thanks,
 Shawn
 


Re: docValues

2015-08-08 Thread Erick Erickson
Have you seen: https://cwiki.apache.org/confluence/display/solr/DocValues?

What kind of speedup? How often are you committing? Is there a speed difference
after a while or on the first few queries?

Details matter a lot for questions like this.

Best,
Erick

On Sat, Aug 8, 2015 at 6:22 PM, Nagasharath sharathrayap...@gmail.com wrote:
 Good

 Sent from my iPhone

 On 08-Aug-2015, at 8:12 pm, Aman Tandon amantandon...@gmail.com wrote:

 Hi,


 I am seeing a significant difference in the query time after using docValue

 what kind of difference, is it good or bad?

 With Regards
 Aman Tandon

 On Sat, Aug 8, 2015 at 11:38 PM, Nagasharath sharathrayap...@gmail.com
 wrote:

 I am seeing a significant difference in the query time after using
 docValue.

 I am curious to know what's happening with 'docValue' included in the
 schema

 On 07-Aug-2015, at 4:31 pm, Shawn Heisey apa...@elyograg.org wrote:

 On 8/7/2015 11:47 AM, naga sharathrayapati wrote:
 JVM-Memory has gone up from 3% to 17.1%

 In my experience, a healthy Java application (after the heap size has
 stabilized) will have a heap utilization graph where the low points are
 between 50 and 75 percent.  If the low points in heap utilization are
 consistently below 25 percent, you would be better off reducing the heap
 size and allowing the OS to use that memory instead.

 If you want to track heap utilization, JVM-Memory in the Solr dashboard
 is a very poor tool.  Use tools like visualvm or jconsole.

 https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

 I need to add what I said about very low heap utilization to that wiki
 page.

 Thanks,
 Shawn