Re: Frequent deletions

2015-01-01 Thread Michael McCandless
Also see this G+ post I wrote up recently showing how the percentage of
deletions changes over time for an "every add also deletes a previous
document" stress test: https://plus.google.com/112759599082866346694/posts/MJVueTznYnD

Mike McCandless

http://blog.mikemccandless.com


On Wed, Dec 31, 2014 at 12:21 PM, Erick Erickson
erickerick...@gmail.com wrote:
 It's usually not necessary to optimize; as more indexing happens you
 should see background merges that reclaim the space, so I
 wouldn't worry about it unless you're seeing actual problems that have
 to be addressed. Here's a great visualization of the process:

 http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

 See especially the third video, TieredMergePolicy, which is the default.

 If you insist, however, try a commit with expungeDeletes=true

 and if that isn't enough, try an optimize call. You can issue a force
 merge (aka optimize) command from the URL (or cURL etc.) as:
 http://localhost:8983/solr/techproducts/update?optimize=true
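
 Similarly, a commit that expunges deletes can be issued as a URL; a
 sketch, assuming the same stock /update handler:
 http://localhost:8983/solr/techproducts/update?commit=true&expungeDeletes=true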

 But please don't do this unless it's absolutely necessary. You state
 that you have frequent deletions, but eventually this should all
 happen in the background. Optimize is a fairly expensive operation and
 should be used judiciously.

 Best,
 Erick

 On Wed, Dec 31, 2014 at 1:32 AM, ig01 inna.gel...@elbitsystems.com wrote:
 Hello,
 We perform frequent deletions from our index, which greatly increases the
 index size.
 How can we perform an optimization in order to reduce the size?
 Please advise,
 Thanks.




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Frequent-deletions-tp4176689.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Create core problem in tomcat

2015-01-01 Thread Mahmoud Almokadem
You may have field types in your schema that use a stopwords.txt file,
like this:

<fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100">

  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>

</fieldType>

So you must have the files stopwords_ar.txt and stopwords_en.txt in
INSTANCE_DIR/conf/lang/, and stopwords.txt in INSTANCE_DIR/conf/.

Sincerely,
Mahmoud

On Thu, Jan 1, 2015 at 9:18 AM, Noora noora.sa...@gmail.com wrote:

 Hi
 I'm using Apache Solr 4.7.2 and Apache Tomcat.
 I can't create a core with a query in my Solr, while I can do it with Jetty
 with the same config.
 The first problem was you can pass the system property
 -Dsolr.allow.unsafe.resourceloading=true to your JVM, which I solved in my
 Catalina.sh.
 Now my error is:
 Unable to create core: uut8 Caused by: Can't find resource 'stopwords.txt'
 in classpath or conf

 My query is :

 http://10.1.221.210:8983/solr/admin/cores?action=CREATE&name=my_core&instanceDir=my_core&dataDir=data&configSet=myConfig

 Can anyone help me?



Re: SpellCheck (AutoComplete) Not Working In Distributed Environment

2015-01-01 Thread Meraj A. Khan
Shawn,

When running SolrCloud, do you even have to include the shards parameter?
Shouldn't the shards.qt parameter alone suffice?
On Dec 30, 2014 7:17 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 12/30/2014 5:03 PM, Charles Sanders wrote:
  Thanks for the suggestion.
 
  I did not do that originally because the documentation states:
  This parameter is not required for the /select request handler.
 
 Which is what I am using. But I gave it a go, even though I'm not
 certain of the shard names. Now I have an NPE.
 
 
 solr/collection1/select?q=kernel+p&rows=1&wt=json&indent=true&shards.qt=/ac&shards=shard1,shard2

 If this is not SolrCloud, then the shards parameter must include most of
 the full base URL for each shard that you will be querying.  You can
 only use a bare shard name if you're running SolrCloud.

 The shards.qt parameter that you have used means that when the shards
 are consulted, the /ac handler will be used rather than /select.

 Here's an example of a shards parameter that will combine results from
 three cores on two machines.  When not running SolrCloud, this is how
 you do distributed searching:

 shards=
 idxa2.example.com:8981/solr/ai-inclive,idxa1.example.com:8981/solr/ai-0live,idxa2.example.com:8981/solr/ai-1live

 SolrCloud hides almost all of this complexity.
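
 As a rough sketch putting shards and shards.qt together (the hosts are
 the examples above; the query term is hypothetical):

 http://idxa1.example.com:8981/solr/ai-0live/select?q=test&shards.qt=/ac&shards=idxa2.example.com:8981/solr/ai-inclive,idxa1.example.com:8981/solr/ai-0live,idxa2.example.com:8981/solr/ai-1live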

 Thanks,
 Shawn




Re: Mixing 4.x SolrJ and Solr.war - compatible?

2015-01-01 Thread Gili Nachum
So, it seems I can't upgrade Solr beyond 4.7 as long as I'm running SolrJ
on a Java 6 JVM.
With any luck I might be able to compile a SolrJ newer than 4.7 with
Java 6. I'll check that next.
Thanks Shawn. That's very helpful!

On Wed, Dec 31, 2014 at 6:54 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 12/31/2014 6:23 AM, Gili Nachum wrote:
  Can I use SolrJ v4.7 with the latest 4.x Solr.war?
  Should I switch the writer from Javabin, back to XML to ensure
  compatibility?
 
 http://wiki.apache.org/solr/Solrj#SolrJ.2FSolr_cross-version_compatibility
 
  I'm using CloudSolrServer. My client is running on Java6 so I can't go
  beyond 4.7.

 If you're running SolrCloud, I would not try it.  SolrCloud is evolving
 very quickly and so many things have changed that running mismatched
 versions is likely to break.  I have tried CloudSolrServer from 4.6.0
 against a SolrCloud running 4.2.1, and a simple query will not work
 because of changes in the data contained in zookeeper.

 If you're not using CloudSolrServer, it should work very well even with
 a wide version discrepancy.

 Switching to XML is not required.  The javabin version has only changed
 once, in version 3.1.0.  It has been stable since that time.
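
 For completeness, if you ever did want to force XML in SolrJ 4.x for a
 non-cloud client, a minimal sketch (the URL is hypothetical) would be:

     import org.apache.solr.client.solrj.impl.HttpSolrServer;
     import org.apache.solr.client.solrj.impl.XMLResponseParser;
     import org.apache.solr.client.solrj.request.RequestWriter;

     HttpSolrServer server =
         new HttpSolrServer("http://localhost:8983/solr/collection1");
     server.setRequestWriter(new RequestWriter()); // XML request writer
     server.setParser(new XMLResponseParser());    // parse XML responses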

 Thanks,
 Shawn




Re: Mixing 4.x SolrJ and Solr.war - compatible?

2015-01-01 Thread Shawn Heisey
On 1/1/2015 6:34 AM, Gili Nachum wrote:
 So, it seems I can't upgrade Solr beyond 4.7 as long as I'm running SolrJ
 on a Java 6 JVM.
 With any luck I might be able to compile a SolrJ newer than 4.7 with
 Java 6. I'll check that next.
 Thanks Shawn. That's very helpful!

Solr 4.8 and later (compared to 4.7) have a very large number of code
changes that will not compile under Java 6.  It would be a *major*
undertaking to change those back.  When the project declared that Java 7
was the minimum version, we weren't just making a statement ... it
really did become the minimum requirement.

Oracle has stopped providing support for Java 6.  Java 7 will also reach
end of support in April 2015, so you might actually want to consider
moving to Java 8, which has even more performance improvements over Java 7.

Along with the latest Java 7 or Java 8, you should probably change your
garbage collection tuning to G1.  With a lot of invaluable help from
Oracle employees, I've worked out a good set of parameters:

https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

Thanks,
Shawn



Re: Queries not supported by Lucene Query Parser syntax

2015-01-01 Thread Jack Krupansky
Yes, you are always limited by the query parser syntax, but of course you
can always write your own query parser as well.

There is an open issue for an XML-based query parser that would give you
greater control, but... it's not committed yet:
https://issues.apache.org/jira/browse/SOLR-839

-- Jack Krupansky

On Thu, Jan 1, 2015 at 4:08 AM, Leonid Bolshinsky leonid...@gmail.com
wrote:

 Hello,

 Are we always limited by the query parser syntax when passing a query
 string to Solr?
 What about the query elements which are not supported by the syntax?
 For example, BooleanQuery.setMinimumNumberShouldMatch(n) is translated by
 BooleanQuery.toString() into ~n. But this is not a valid query syntax. So
 how can we express this via query syntax in Solr?

 And a more general question:
 Given a Lucene Query object which was built programmatically by legacy
 code (which is using Lucene and not Solr), is there any way to translate it
 into a Solr query (which must be a string)? As Query.toString() doesn't have
 to be valid Lucene query syntax, does it mean that the Solr query string
 must be manually translated from the Lucene query object? Is there any
 utility that performs this job? And, again, what about queries not
 supported by the query syntax, like CustomScoreQuery, PayloadTermQuery
 etc.? Are we always limited in Solr by the query parser syntax?

 Thanks,
 Leonid



Re: Queries not supported by Lucene Query Parser syntax

2015-01-01 Thread David Philip
Hi Leonid,

   Have you had a look at the edismax query parser [1]? Might that be of any
use for your requirement? I am not sure whether it is exactly what you are
looking for, but your question seemed related to it.


[1] http://wiki.apache.org/solr/ExtendedDisMax#Query_Syntax
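
For the minimum-should-match case specifically, edismax exposes it as the mm
parameter; a rough sketch (the core name and query terms are hypothetical):

http://localhost:8983/solr/collection1/select?defType=edismax&q=foo+bar+baz&mm=2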



On Thu, Jan 1, 2015 at 2:38 PM, Leonid Bolshinsky leonid...@gmail.com
wrote:

 Hello,

 Are we always limited by the query parser syntax when passing a query
 string to Solr?
 What about the query elements which are not supported by the syntax?
 For example, BooleanQuery.setMinimumNumberShouldMatch(n) is translated by
 BooleanQuery.toString() into ~n. But this is not a valid query syntax. So
 how can we express this via query syntax in Solr?

 And a more general question:
 Given a Lucene Query object which was built programmatically by legacy
 code (which is using Lucene and not Solr), is there any way to translate it
 into a Solr query (which must be a string)? As Query.toString() doesn't have
 to be valid Lucene query syntax, does it mean that the Solr query string
 must be manually translated from the Lucene query object? Is there any
 utility that performs this job? And, again, what about queries not
 supported by the query syntax, like CustomScoreQuery, PayloadTermQuery
 etc.? Are we always limited in Solr by the query parser syntax?

 Thanks,
 Leonid



Queries not supported by Lucene Query Parser syntax

2015-01-01 Thread Leonid Bolshinsky
Hello,

Are we always limited by the query parser syntax when passing a query
string to Solr?
What about the query elements which are not supported by the syntax?
For example, BooleanQuery.setMinimumNumberShouldMatch(n) is translated by
BooleanQuery.toString() into ~n. But this is not a valid query syntax. So
how can we express this via query syntax in Solr?
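
For concreteness, a minimal Lucene 4.x sketch of that case (the field name
and terms are hypothetical):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.TermQuery;

    BooleanQuery bq = new BooleanQuery();
    bq.add(new TermQuery(new Term("f", "foo")), BooleanClause.Occur.SHOULD);
    bq.add(new TermQuery(new Term("f", "bar")), BooleanClause.Occur.SHOULD);
    bq.add(new TermQuery(new Term("f", "baz")), BooleanClause.Occur.SHOULD);
    bq.setMinimumNumberShouldMatch(2);
    // bq.toString("f") prints "(foo bar baz)~2", which the classic
    // query parser cannot parse back.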

And a more general question:
Given a Lucene Query object which was built programmatically by legacy
code (which is using Lucene and not Solr), is there any way to translate it
into a Solr query (which must be a string)? As Query.toString() doesn't have
to be valid Lucene query syntax, does it mean that the Solr query string
must be manually translated from the Lucene query object? Is there any
utility that performs this job? And, again, what about queries not
supported by the query syntax, like CustomScoreQuery, PayloadTermQuery
etc.? Are we always limited in Solr by the query parser syntax?

Thanks,
Leonid


ignoring bad documents during index

2015-01-01 Thread SolrUser1543
Suppose I need to index a bulk of several documents (D1 D2 D3 D4) - 4
documents in one request.

If, e.g., D3 is incorrect, an exception will be thrown and an HTTP response
with 400 Bad Request will be returned.

Documents D1 and D2 will be indexed, but D4 will not. Also, no indication of
this will be returned.

1. Is it possible to ignore such an error and continue to index D4?
2. What would be the best way to add information about failed documents? I
thought about an update processor, with try/catch in the add command and, in
case of an exception, adding the doc ID to the response.
Or might it be better to implement a component or response writer to add the
info?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/ignoring-bad-documents-during-index-tp4176947.html
Sent from the Solr - User mailing list archive at Nabble.com.


Garbage Collection tuning - G1 is now a good option

2015-01-01 Thread Shawn Heisey
I've been working with Oracle employees to find better GC tuning
options.  The results are good enough to share with the community:

https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

With the latest Java 7 or Java 8 version, and a couple of tuning
options, G1GC has grown up enough to be a viable choice.  Two of the
settings on that list were critical for making the performance
acceptable with my testing: ParallelRefProcEnabled and G1HeapRegionSize.

I've included some notes on the wiki about how you can size the G1 heap
regions appropriately for your own index.
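
As a sketch of what this looks like on the command line (the heap and region
sizes here are illustrative, not recommendations - see the wiki page for how
to size them for your index):

    java -Xms4g -Xmx4g \
         -XX:+UseG1GC \
         -XX:+ParallelRefProcEnabled \
         -XX:G1HeapRegionSize=8m \
         -jar start.jar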

Thanks,
Shawn



Re: Queries not supported by Lucene Query Parser syntax

2015-01-01 Thread Ahmet Arslan
Hi Leonid,

Here is another uncommitted parser:
https://issues.apache.org/jira/browse/LUCENE-5205

Ahmet


On Thursday, January 1, 2015 5:59 PM, Roman Chyla roman.ch...@gmail.com wrote:



Hi Leonid,

I haven't looked into the Solr qparser for a long time, but I think you
should be able to combine different query parsers in one query. Look at the
SolrQueryParser code; maybe now you can specify a custom query parser for
every clause (?), something like:

foo AND {!lucene}bar

I don't know, but it's worth exploring.

There is another implementation of a query language which I know allows
combining different query parsers in one (because I wrote it); there the
query goes this way:

edismax(dog cat AND lucene((foo AND bar)~3))

meaning: use edismax to build the main query, but let the lucene query parser
build the 3rd clause - the nested 'foo AND bar' (parsers are expressed as
function operators, so you can use any query parser that exists in Solr)

it is here, https://issues.apache.org/jira/browse/LUCENE-5014, but that was
not reviewed/integrated either


So no, you are not always limited by the query parser - you can combine
them (in a more or less limited fashion). But yes, the query parsers limit
the expressiveness of your query language, but not what can be searched
(they will all produce a Query object).

Best,

  roman




On Thu, Jan 1, 2015 at 10:15 AM, Jack Krupansky jack.krupan...@gmail.com
wrote:

 Yes, you are always limited by the query parser syntax, but of course you
 can always write your own query parser as well.

 There is an open issue for an XML-based query parser that would give you
  greater control, but... it's not committed yet:
 https://issues.apache.org/jira/browse/SOLR-839

 -- Jack Krupansky

 On Thu, Jan 1, 2015 at 4:08 AM, Leonid Bolshinsky leonid...@gmail.com
 wrote:

  Hello,
 
  Are we always limited by the query parser syntax when passing a query
  string to Solr?
  What about the query elements which are not supported by the syntax?
  For example, BooleanQuery.setMinimumNumberShouldMatch(n) is translated by
  BooleanQuery.toString() into ~n. But this is not a valid query syntax. So
  how can we express this via query syntax in Solr?
 
  And a more general question:
  Given a Lucene Query object which was built programmatically by legacy
  code (which is using Lucene and not Solr), is there any way to translate it
  into a Solr query (which must be a string)? As Query.toString() doesn't have
  to be valid Lucene query syntax, does it mean that the Solr query string
  must be manually translated from the Lucene query object? Is there any
  utility that performs this job? And, again, what about queries not
  supported by the query syntax, like CustomScoreQuery, PayloadTermQuery
  etc.? Are we always limited in Solr by the query parser syntax?
 
  Thanks,
  Leonid
 



Re: Frequent deletions

2015-01-01 Thread Alexandre Rafalovitch
Is there a specific list of which data structures are sparse and
non-sparse for Lucene and Solr (referencing the G+ post)? I imagine this
is obvious to low-level hackers, but it could actually be nice to
summarize somewhere for troubleshooting.

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 1 January 2015 at 05:22, Michael McCandless
luc...@mikemccandless.com wrote:
 Also see this G+ post I wrote up recently showing how the percentage of
 deletions changes over time for an "every add also deletes a previous
 document" stress test: https://plus.google.com/112759599082866346694/posts/MJVueTznYnD

 Mike McCandless

 http://blog.mikemccandless.com


 On Wed, Dec 31, 2014 at 12:21 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 It's usually not necessary to optimize; as more indexing happens you
 should see background merges that reclaim the space, so I
 wouldn't worry about it unless you're seeing actual problems that have
 to be addressed. Here's a great visualization of the process:

 http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

 See especially the third video, TieredMergePolicy, which is the default.

 If you insist, however, try a commit with expungeDeletes=true

 and if that isn't enough, try an optimize call. You can issue a force
 merge (aka optimize) command from the URL (or cURL etc.) as:
 http://localhost:8983/solr/techproducts/update?optimize=true

 But please don't do this unless it's absolutely necessary. You state
 that you have frequent deletions, but eventually this should all
 happen in the background. Optimize is a fairly expensive operation and
 should be used judiciously.

 Best,
 Erick

 On Wed, Dec 31, 2014 at 1:32 AM, ig01 inna.gel...@elbitsystems.com wrote:
 Hello,
 We perform frequent deletions from our index, which greatly increases the
 index size.
 How can we perform an optimization in order to reduce the size?
 Please advise,
 Thanks.




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Frequent-deletions-tp4176689.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Queries not supported by Lucene Query Parser syntax

2015-01-01 Thread Roman Chyla
Hi Leonid,

I haven't looked into the Solr qparser for a long time, but I think you
should be able to combine different query parsers in one query. Look at the
SolrQueryParser code; maybe now you can specify a custom query parser for
every clause (?), something like:

foo AND {!lucene}bar

I don't know, but it's worth exploring.

There is another implementation of a query language which I know allows
combining different query parsers in one (because I wrote it); there the
query goes this way:

edismax(dog cat AND lucene((foo AND bar)~3))

meaning: use edismax to build the main query, but let the lucene query parser
build the 3rd clause - the nested 'foo AND bar' (parsers are expressed as
function operators, so you can use any query parser that exists in Solr)

it is here, https://issues.apache.org/jira/browse/LUCENE-5014, but that was
not reviewed/integrated either


So no, you are not always limited by the query parser - you can combine
them (in a more or less limited fashion). But yes, the query parsers limit
the expressiveness of your query language, but not what can be searched
(they will all produce a Query object).

Best,

  roman



On Thu, Jan 1, 2015 at 10:15 AM, Jack Krupansky jack.krupan...@gmail.com
wrote:

 Yes, you are always limited by the query parser syntax, but of course you
 can always write your own query parser as well.

 There is an open issue for an XML-based query parser that would give you
  greater control, but... it's not committed yet:
 https://issues.apache.org/jira/browse/SOLR-839

 -- Jack Krupansky

 On Thu, Jan 1, 2015 at 4:08 AM, Leonid Bolshinsky leonid...@gmail.com
 wrote:

  Hello,
 
  Are we always limited by the query parser syntax when passing a query
  string to Solr?
  What about the query elements which are not supported by the syntax?
  For example, BooleanQuery.setMinimumNumberShouldMatch(n) is translated by
  BooleanQuery.toString() into ~n. But this is not a valid query syntax. So
  how can we express this via query syntax in Solr?
 
  And a more general question:
  Given a Lucene Query object which was built programmatically by legacy
  code (which is using Lucene and not Solr), is there any way to translate it
  into a Solr query (which must be a string)? As Query.toString() doesn't have
  to be valid Lucene query syntax, does it mean that the Solr query string
  must be manually translated from the Lucene query object? Is there any
  utility that performs this job? And, again, what about queries not
  supported by the query syntax, like CustomScoreQuery, PayloadTermQuery
  etc.? Are we always limited in Solr by the query parser syntax?
 
  Thanks,
  Leonid
 



Re: Queries not supported by Lucene Query Parser syntax

2015-01-01 Thread Mikhail Khludnev
Hello Leonid,

Yep. This problem exists and makes migration from Lucene to Solr hard.
You might be interested in Parboiled:
http://www.youtube.com/watch?v=DXiRYfFGHJE
The simplest way to solve it is to serialize the Lucene Query instance into a
parameter or the request body. Unfortunately, Query is not Serializable, but
it's possible to do this with non-invasive serializers like XStream. Then, a
QParserPlugin can read this param or body and deserialize the Lucene query
instance.
Have a good hack!
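
A rough sketch of the serialize/deserialize idea (assuming XStream is on the
classpath; the QParserPlugin wiring itself is omitted):

    import com.thoughtworks.xstream.XStream;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    Query luceneQuery = new TermQuery(new Term("f", "foo")); // from legacy code
    XStream xstream = new XStream();
    // client side: turn the Query into a string to send as a param/body
    String xml = xstream.toXML(luceneQuery);
    // server side, inside a custom QParserPlugin: rebuild the Query
    Query q = (Query) xstream.fromXML(xml);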

On Thu, Jan 1, 2015 at 12:08 PM, Leonid Bolshinsky leonid...@gmail.com
wrote:

 Hello,

 Are we always limited by the query parser syntax when passing a query
 string to Solr?
 What about the query elements which are not supported by the syntax?
 For example, BooleanQuery.setMinimumNumberShouldMatch(n) is translated by
 BooleanQuery.toString() into ~n. But this is not a valid query syntax. So
 how can we express this via query syntax in Solr?

 And a more general question:
 Given a Lucene Query object which was built programmatically by legacy
 code (which is using Lucene and not Solr), is there any way to translate it
 into a Solr query (which must be a string)? As Query.toString() doesn't have
 to be valid Lucene query syntax, does it mean that the Solr query string
 must be manually translated from the Lucene query object? Is there any
 utility that performs this job? And, again, what about queries not
 supported by the query syntax, like CustomScoreQuery, PayloadTermQuery
 etc.? Are we always limited in Solr by the query parser syntax?

 Thanks,
 Leonid




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


Re: ignoring bad documents during index

2015-01-01 Thread Mikhail Khludnev
Hello,
Please find my answers below.

On Thu, Jan 1, 2015 at 11:59 PM, SolrUser1543 osta...@gmail.com wrote:

 1. Is it possible to ignore such an error and continue to index D4?

this can be done by catching and swallowing the exception in a custom
UpdateRequestProcessor


 2. What would be the best way to add information about failed documents? I
 thought about an update processor, with try/catch in the add command and, in
 case of an exception, adding the doc ID to the response.
 Or might it be better to implement a component or response writer to add the
 info?

it turns out that you can add this info to SolrQueryResponse.getValues() even
in a custom UpdateRequestProcessor, and it should be returned in the response.
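
A rough sketch of such a processor (the class name is hypothetical; based on
the Solr 4.x UpdateRequestProcessor API):

    import java.io.IOException;
    import org.apache.solr.response.SolrQueryResponse;
    import org.apache.solr.update.AddUpdateCommand;
    import org.apache.solr.update.processor.UpdateRequestProcessor;

    public class TolerantUpdateProcessor extends UpdateRequestProcessor {
      private final SolrQueryResponse rsp;

      public TolerantUpdateProcessor(SolrQueryResponse rsp,
                                     UpdateRequestProcessor next) {
        super(next);
        this.rsp = rsp;
      }

      @Override
      public void processAdd(AddUpdateCommand cmd) throws IOException {
        try {
          super.processAdd(cmd);
        } catch (Exception e) {
          // swallow the failure and report the offending doc id to the client
          rsp.getValues().add("failedId", cmd.getPrintableId());
        }
      }
    }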






 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/ignoring-bad-documents-during-index-tp4176947.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


Re: Garbage Collection tuning - G1 is now a good option

2015-01-01 Thread William Bell
But tons of people on this mailing list do not recommend AggressiveOpts.

Why do you recommend it?

On Thu, Jan 1, 2015 at 12:10 PM, Shawn Heisey apa...@elyograg.org wrote:

 I've been working with Oracle employees to find better GC tuning
 options.  The results are good enough to share with the community:

 https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

 With the latest Java 7 or Java 8 version, and a couple of tuning
 options, G1GC has grown up enough to be a viable choice.  Two of the
 settings on that list were critical for making the performance
 acceptable with my testing: ParallelRefProcEnabled and G1HeapRegionSize.

 I've included some notes on the wiki about how you can size the G1 heap
 regions appropriately for your own index.

 Thanks,
 Shawn




-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


UseLargePages

2015-01-01 Thread William Bell
Do you think setting aside 2GB for UseLargePages would generally help
indexing or not?

I can imagine it might help.

-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: poor performance when connecting to CloudSolrServer(zkHosts) using solrJ

2015-01-01 Thread Mohmed Hussain
My two cents: do check network connectivity. In the past I remember that
changing the zookeeper server name to the actual IP improved the speed a bit.
DNS sometimes takes time to resolve a hostname, so it could be worth trying
this option.


Thanks
-Hussain

On Mon, Dec 29, 2014 at 6:31 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 12/29/2014 6:52 PM, zhangjia...@dcits.com wrote:
   I set up a SolrCloud and coded a simple SolrJ program to query Solr
  data as below, but it takes about 40 seconds to create a new CloudSolrServer
  instance; less than 100 milliseconds would be acceptable. What is going on
  when creating a new CloudSolrServer, and how do I fix this issue?
 
   String zkHost = "bicenter1.dcc:2181,datanode2.dcc:2181";
   String defaultCollection = "hdfsCollection";

   long startms = System.currentTimeMillis();
   CloudSolrServer server = new CloudSolrServer(zkHost);
   server.setDefaultCollection(defaultCollection);
   server.setZkConnectTimeout(3000);
   server.setZkClientTimeout(6000);
   long endms = System.currentTimeMillis();
   System.out.println(endms - startms);

   ModifiableSolrParams params = new ModifiableSolrParams();
   params.set("q", "id:*hbase*");
   params.set("sort", "price desc");
   params.set("start", 0);
   params.set("rows", 10);

   try {
       QueryResponse response = server.query(params);
       SolrDocumentList results = response.getResults();
       for (SolrDocument doc : results) {
           String rowkey = doc.getFieldValue("id").toString();
       }
   } catch (SolrServerException e) {
       // TODO Auto-generated catch block
       e.printStackTrace();
   }

   server.shutdown();

 The only part of the constructor for CloudSolrServer that I cannot
 easily look at is the part that creates the httpclient, because
 ultimately that calls code outside of Solr, in the HttpComponents
 project.  Everything that I *can* see is code that should happen
 extremely quickly, and the httpclient creation code is something that I
 have used myself and never had any noticeable delay.  The constructor
 for CloudSolrServer does *NOT* contact zookeeper or Solr, it merely sets
 up the instance.  Nothing is contacted until a request is made.  I
 examined the CloudSolrServer code from branch_5x.

 I tried out your code (with SolrJ 4.6.0 against a SolrCloud 4.2.1
 cluster).  Although the query itself encountered an exception in
 zookeeper (probably from the version discrepancy between Solr and
 SolrJ), the elapsed time printed out from the CloudSolrServer
 initialization was 240 milliseconds on the first run, 60 milliseconds on
 a second run, and 64 milliseconds on a third run.  Those are all MUCH
 less than the 1000 milliseconds that would represent one second, and
 incredibly less than the 40000 milliseconds that would represent 40
 seconds.

 Side issue:  I hope that you have more than two zookeeper servers in
 your ensemble.  A two-node zookeeper ensemble is actually *less*
 reliable than a single node, because a failure of EITHER of those two
 nodes will result in a loss of quorum.  Three nodes is the minimum
 required for a redundant zookeeper ensemble.

 Thanks,
 Shawn




RE: poor performance when connecting to CloudSolrServer(zkHosts) using solrJ

2015-01-01 Thread steve
While I'm not a net optimization whiz, a properly configured DNS client will
cache recently resolved lookups; this way, even though you are referring to
the Fully Qualified Domain Name (FQDN), the local DNS client will return the
recently acquired IP address (within the constraints of the domain's
configuration). In other words, while there is overhead between the local
workstation/computer and the DNS client, it will NOT require access to the
configured DNS server upstream.
Enjoy,
Steve

 Date: Thu, 1 Jan 2015 14:30:19 -0800
 Subject: Re: poor performance when connecting to CloudSolrServer(zkHosts) 
 using solrJ
 From: mohd.huss...@gmail.com
 To: solr-user@lucene.apache.org
 
 My two cents: do check network connectivity. In the past I remember that
 changing the zookeeper server name to the actual IP improved the speed a bit.
 DNS sometimes takes time to resolve a hostname, so it could be worth trying
 this option.
 
 
 Thanks
 -Hussain
 
 On Mon, Dec 29, 2014 at 6:31 PM, Shawn Heisey apa...@elyograg.org wrote:
 
  On 12/29/2014 6:52 PM, zhangjia...@dcits.com wrote:
  I set up a SolrCloud and coded a simple SolrJ program to query Solr
   data as below, but it takes about 40 seconds to create a new CloudSolrServer
   instance; less than 100 milliseconds would be acceptable. What is going on
   when creating a new CloudSolrServer, and how do I fix this issue?
  
  String zkHost = "bicenter1.dcc:2181,datanode2.dcc:2181";
  String defaultCollection = "hdfsCollection";

  long startms = System.currentTimeMillis();
  CloudSolrServer server = new CloudSolrServer(zkHost);
  server.setDefaultCollection(defaultCollection);
  server.setZkConnectTimeout(3000);
  server.setZkClientTimeout(6000);
  long endms = System.currentTimeMillis();
  System.out.println(endms - startms);

  ModifiableSolrParams params = new ModifiableSolrParams();
  params.set("q", "id:*hbase*");
  params.set("sort", "price desc");
  params.set("start", 0);
  params.set("rows", 10);

  try {
      QueryResponse response = server.query(params);
      SolrDocumentList results = response.getResults();
      for (SolrDocument doc : results) {
          String rowkey = doc.getFieldValue("id").toString();
      }
  } catch (SolrServerException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
  }

  server.shutdown();
 
  The only part of the constructor for CloudSolrServer that I cannot
  easily look at is the part that creates the httpclient, because
  ultimately that calls code outside of Solr, in the HttpComponents
  project.  Everything that I *can* see is code that should happen
  extremely quickly, and the httpclient creation code is something that I
  have used myself and never had any noticeable delay.  The constructor
  for CloudSolrServer does *NOT* contact zookeeper or Solr, it merely sets
  up the instance.  Nothing is contacted until a request is made.  I
  examined the CloudSolrServer code from branch_5x.
 
  I tried out your code (with SolrJ 4.6.0 against a SolrCloud 4.2.1
  cluster).  Although the query itself encountered an exception in
  zookeeper (probably from the version discrepancy between Solr and
  SolrJ), the elapsed time printed out from the CloudSolrServer
  initialization was 240 milliseconds on the first run, 60 milliseconds on
  a second run, and 64 milliseconds on a third run.  Those are all MUCH
  less than the 1000 milliseconds that would represent one second, and
  incredibly less than the 40000 milliseconds that would represent 40
  seconds.
 
  Side issue:  I hope that you have more than two zookeeper servers in
  your ensemble.  A two-node zookeeper ensemble is actually *less*
  reliable than a single node, because a failure of EITHER of those two
  nodes will result in a loss of quorum.  Three nodes is the minimum
  required for a redundant zookeeper ensemble.
 
  Thanks,
  Shawn