Re: Result merging takes too long

2014-03-16 Thread remi tassing
 So I have to ask what the end goal is here.
In our case, the purpose of sharding was/is to speed up the process.
We've noticed that as the index size was growing, response speed kept going
down so we decided to split the index across 5 machines.

 Are your response times really in need of improvement or is this more
 trying to understand the process?
Our response time went from 1 second to 5+ seconds, so we thought we could
definitely do better with Solr(Cloud).

'start' and 'rows' are generally set to the default values (i.e., 0 and 10
respectively).

Any clues how to conduct further investigations?
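
One way to break out per-shard time (a sketch; shards.info is a standard
distributed-search parameter, and the host/collection names here are
illustrative):

curl "http://localhost:8983/solr/collection1/select?q=project+development+agile&fl=id&shards.info=true&debugQuery=on"

Each entry in the returned shards.info section carries its own elapsed time;
the gap between the slowest shard and the top-level QTime approximates the
merge/coordination overhead.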


On Sun, Mar 16, 2014 at 7:29 AM, Erick Erickson erickerick...@gmail.com wrote:

 I wouldn't expect the merge times to be significant
 at all, _assuming_ you're not doing something like
 setting a very high start= parameter or returning
 a whole lot of rows.

 Now, it may be that you're sharding with too small
 a document set to really notice a difference.
 Sharding isn't really about speeding up responses
 so much as it is about being able to handle very large indexes.

 So I have to ask what the end goal is here. Are
 your response times really in need of improvement
 or is this more trying to understand the process?

 Best,
 Erick

 On Thu, Mar 13, 2014 at 1:19 AM, remi tassing tassingr...@gmail.com
 wrote:
  Hi Erick,
 
  I've used the fl=id parameter to avoid retrieving the actual documents
  (step 4 in your mail) but the problem still exists.
  Any ideas on how to find the merging time(step 3)?
 
  Remi
 
 
  On Tue, Mar 11, 2014 at 7:29 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  In SolrCloud there are a couple of round trips
  that _may_ be what you're seeing.
 
  First, though, the QTime is the time spent
  querying; it does NOT include assembling
  the documents from disk for return etc., so
  bear that in mind.
 
  But here's the sequence as I understand it
  from the receiving node's viewpoint.
  1. Send the query out to one replica for each shard.
  2. Get the top N doc IDs and scores (or whatever the sorting criteria are)
     from each shard.
  3. Merge the lists and select the top N to return.
  4. Request the actual documents for the top N list from each of the shards.
  5. Return the list.
 
  So as you can see, there's an extra
  round trip to each shard to get the
  full documents. Perhaps this is what
  you're seeing? I don't think step 4
  is counted in QTime.
 
  HTH
  Erick
 
  On Tue, Mar 11, 2014 at 3:17 AM, remi tassing tassingr...@gmail.com
  wrote:
   Hi,
  
   I've just set up a SolrCloud with Tomcat: 5 shards with one replica each
   and a total of 10 million docs (evenly distributed).
  
   I've noticed the query response time is faster than using one single
 node
   but still not as fast as I expected.
  
   After turning debugQuery on, I noticed the query time is different from the
   value returned in the debug explanation (see some excerpt below). More
   importantly, when making a query to one, and only one, shard, the result
   is consistent. It appears the server spends most of its time doing
   result aggregation (merging).
  
   After searching on Google I didn't find anything concrete, except
   that the problem could be in 'SearchComponent'.
  
   Could you point me in the right direction (e.g. configuration...)?
  
   Thanks!
  
   Remi
  
   Solr Cloud result:
  
   <lst name="responseHeader">
     <int name="status">0</int>
     <int name="QTime">3471</int>
     <lst name="params">
       <str name="debugQuery">on</str>
       <str name="q">project development agile</str>
     </lst>
   </lst>

   <result name="response" numFound="2762803" start="0"
           maxScore="0.17022902">...</result>

   ...

   <lst name="timing">
     <double name="time">508.0</double>
     <lst name="prepare">
       <double name="time">8.0</double>
       <lst name="query"><double name="time">8.0</double></lst>
       <lst name="facet"><double name="time">0.0</double></lst>
       <lst name="mlt"><double name="time">0.0</double></lst>
       <lst name="highlight"><double name="time">0.0</double></lst>
       <lst name="stats"><double name="time">0.0</double></lst>
       <lst name="debug"><double name="time">0.0</double></lst>
     </lst>
     <lst name="process">
       <double name="time">499.0</double>
       <lst name="query"><double name="time">195.0</double></lst>
       <lst name="facet"><double name="time">0.0</double></lst>
       <lst name="mlt"><double name="time">0.0</double></lst>
       <lst name="highlight"><double name="time">228.0</double></lst>
       <lst name="stats"><double name="time">0.0</double></lst>
       <lst name="debug"><double name="time">76.0</double></lst>
     </lst>
   </lst>
 



Problems using solr.SpatialRecursivePrefixTreeFieldType

2014-03-16 Thread Hamish Campbell
Hey all,

Trying to use SpatialRecursivePrefixTreeFieldType to store extent polygons
but I can't seem to get it configured correctly. Hitting this on start up:

4670 [main] ERROR org.apache.solr.core.SolrCore  - Error loading
 core:java.util.concurrent.ExecutionException:
 java.lang.NoClassDefFoundError:
 com/vividsolutions/jts/geom/CoordinateSequenceFactory


Per the manual, and David Smiley's previous responses, I've added the jts
.jar files to WEB-INF/lib in the solr .war file.

I'm still getting the error above, any other clues?

-- 
Hamish Campbell
Koordinates Ltd http://koordinates.com/?_bzhc=esig
PH   +64 9 966 0433
FAX +64 9 966 0045


Nested documents, block join - re-indexing a single document upon update

2014-03-16 Thread danny teichthal
 Hi All,


 To make things short, I would like to use block joins, but to be able to
index each document in the block separately.

Is it possible?



In more details:



We have some nested parent-child structure where:

1.   Parent may have a single level of children

2.   Parent and child documents may be updated independently.

3.   We may want to search for parent by child info and vice versa.



At first we thought of managing the parent and child in different
documents, de-normalizing child data at parent level and parent data on
child level.

After reading Mikhail's blog
http://blog.griddynamics.com/2013/09/solr-block-join-support.html, we
thought of using the block join for this purpose.



But, I got into a wall when trying to update a single child document.

For me, it's ok if SOLR will internally index the whole block, I just don't
want to fetch the whole hierarchy from DB for update.



I was trying to achieve this using atomic updates - since all the fields
must be stored anyway - if I send an atomic update on one of the children
with the _root_ field then there's no need to send the whole hierarchy.

But, when I try this method, I see that the child document is indeed
updated, but its order is changed to be after the parent.



This is what I did:

1.   Change the root field to be stored - <field name="_root_"
type="string" indexed="true" stored="true"/>

2.   Put attached docs on example\exampledocs.

3.   Run post.jar on parent-child.xml

4.   Run post.jar on update-child-atomic.xml.

5.   Now,
http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%27type_s%3Aparent%27%7D%2BCOLOR_s%3ARed+%2BSIZE_s%3AXL&wt=json&indent=true
returns parent 10 as expected.

6.   But,
http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%27type_s%3Aparent%27%7D%2BCOLOR_s%3AGreen+%2BSIZE_s%3AXL&wt=json&indent=true
returns nothing.

7.   When searching *:* on Admin, record with id=12 was updated with
'Green', but it is returned below the parent record.



Thanks in advance.



In case the attachments do not work:

1st file to post:



<update>
  <delete><query>*:*</query></delete>
  <add>
    <doc>
      <field name="id">10</field>
      <field name="type_s">parent</field>
      <field name="BRAND_s">Nike</field>
      <doc>
        <field name="id">11</field>
        <field name="COLOR_s">Red</field>
        <field name="SIZE_s">XL</field>
      </doc>
      <doc>
        <field name="id">12</field>
        <field name="COLOR_s">Blue</field>
        <field name="SIZE_s">XL</field>
      </doc>
    </doc>
  </add>
  <commit/>
</update>







2nd file:



<update>
  <add>
    <doc>
      <field name="id">12</field>
      <field name="COLOR_s" update="set">Green</field>
      <field name="SIZE_s">XL</field>
      <field name="_root_">10</field>
    </doc>
  </add>
  <commit/>
</update>

Re: Problems using solr.SpatialRecursivePrefixTreeFieldType

2014-03-16 Thread Lajos

Hi Hamish,

Are you running Jetty?

In Tomcat, I've put jts-1.13.jar in the WEB-INF/lib directory of the 
unpacked distribution and restarted. It worked fine.
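
For reference, the field type that pulls in JTS typically looks like this
(a sketch along the lines of the 4.x examples; the attribute values are
illustrative):

<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
    spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
    geo="true" distErrPct="0.025" maxDistErr="0.000009" units="degrees"/>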


Maybe check file permissions as well ...

Regards,

Lajos



On 16/03/2014 10:18, Hamish Campbell wrote:

Hey all,

Trying to use SpatialRecursivePrefixTreeFieldType to store extent polygons
but I can't seem to get it configured correctly. Hitting this on start up:

4670 [main] ERROR org.apache.solr.core.SolrCore  - Error loading

core:java.util.concurrent.ExecutionException:
java.lang.NoClassDefFoundError:
com/vividsolutions/jts/geom/CoordinateSequenceFactory



 Per the manual, and David Smiley's previous responses, I've added the jts
 .jar files to WEB-INF/lib in the solr .war file.

I'm still getting the error above, any other clues?



Re: Nested documents, block join - re-indexing a single document upon update

2014-03-16 Thread Jack Krupansky
You stumbled upon the whole point of block join – that the documents are and 
must be managed as a block and not individually.
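
In practice that means re-sending the whole block whenever any child
changes. A sketch, reusing the example docs from this thread (only child 12
changed, but parent 10 and child 11 are posted again with it):

<add>
  <doc>
    <field name="id">10</field>
    <field name="type_s">parent</field>
    <field name="BRAND_s">Nike</field>
    <doc>
      <field name="id">11</field>
      <field name="COLOR_s">Red</field>
      <field name="SIZE_s">XL</field>
    </doc>
    <doc>
      <field name="id">12</field>
      <field name="COLOR_s">Green</field>
      <field name="SIZE_s">XL</field>
    </doc>
  </doc>
</add>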

-- Jack Krupansky

From: danny teichthal 
Sent: Sunday, March 16, 2014 6:47 AM
To: solr-user@lucene.apache.org 
Subject: Nested documents, block join - re-indexing a single document upon 
update




Hi All,




To make things short, I would like to use block joins, but to be able to index
each document in the block separately.

Is it possible?



In more details:



We have some nested parent-child structure where:

1.   Parent may have a single level of children

2.   Parent and child documents may be updated independently.

3.   We may want to search for parent by child info and vice versa.



At first we thought of managing the parent and child in different documents, 
de-normalizing child data at parent level and parent data on child level.

After reading Mikhail's blog
http://blog.griddynamics.com/2013/09/solr-block-join-support.html, we thought
of using the block join for this purpose.



But, I got into a wall when trying to update a single child document.

For me, it’s ok if SOLR will internally index the whole block, I just don’t 
want to fetch the whole hierarchy from DB for update.



I was trying to achieve this using atomic updates – since all the fields must 
be stored anyway – if I send an atomic update on one of the children with the 
_root_ field then there’s no need to send the whole hierarchy.

But, when I try this method, I see that the child document is indeed updated,
but its order is changed to be after the parent.



This is what I did:

1.   Change the root field to be stored - <field name="_root_"
type="string" indexed="true" stored="true"/>

2.   Put attached docs on example\exampledocs.

3.   Run post.jar on parent-child.xml

4.   Run post.jar on update-child-atomic.xml.

5.   Now,
http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%27type_s%3Aparent%27%7D%2BCOLOR_s%3ARed+%2BSIZE_s%3AXL&wt=json&indent=true
returns parent 10 as expected.

6.   But,
http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%27type_s%3Aparent%27%7D%2BCOLOR_s%3AGreen+%2BSIZE_s%3AXL&wt=json&indent=true
returns nothing.

7.   When searching *:* on Admin, record with id=12 was updated with
'Green', but it is returned below the parent record.



Thanks in advance.



In case the attachments do not work:

1st file to post:



<update>
  <delete><query>*:*</query></delete>
  <add>
    <doc>
      <field name="id">10</field>
      <field name="type_s">parent</field>
      <field name="BRAND_s">Nike</field>
      <doc>
        <field name="id">11</field>
        <field name="COLOR_s">Red</field>
        <field name="SIZE_s">XL</field>
      </doc>
      <doc>
        <field name="id">12</field>
        <field name="COLOR_s">Blue</field>
        <field name="SIZE_s">XL</field>
      </doc>
    </doc>
  </add>
  <commit/>
</update>







2nd file:



<update>
  <add>
    <doc>
      <field name="id">12</field>
      <field name="COLOR_s" update="set">Green</field>
      <field name="SIZE_s">XL</field>
      <field name="_root_">10</field>
    </doc>
  </add>
  <commit/>
</update>



Re: example schema now stores most field values

2014-03-16 Thread Michael Sokolov
Thanks for hunting that down, Jack. It may very well have been a change
that we made (to remove the stored=true). Sorry if I led you on a wild
goose chase.


Actually, I wonder if people have a better method for performing config
upgrades than mine. I find that every time I take a Solr upgrade, I
need to do a point-by-point comparison of the new schema (and
solrconfig) with my current one in order to upgrade the configuration
while keeping my custom fields, etc. Plus, there are all kinds of fields
in the sample schema that I don't really need: sku, manu, etc., which I
typically remove just to cut down on clutter. Does anyone have a script
that uses svn to do this in a clever way that they'd like to share?
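
(For illustration, a plain 3-way merge is one non-svn way to do this - a
minimal sketch, assuming a pristine copy of the stock config you originally
started from was kept around; the paths are illustrative:)

# mine = customized config, base = stock config of the old release,
# new = stock config of the new release
diff3 -m mine/schema.xml base-4.2.1/schema.xml new-4.6.1/schema.xml \
  > merged/schema.xml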


-Mike

On 3/15/2014 4:52 PM, Jack Krupansky wrote:
Could it be that you had dropped a pre-4.2.1 schema into 4.2.1? I 
mean, I just exhaustively examined all schema.xml changes between 
4.2.1 and 4.6.1 (all 6 of them) and saw no wholesale change to 
stored=true. Maybe somebody on your end removed a lot of fields from 
the 4.2.1 release of schema.xml.


Could you give a couple of examples of fields that changed from your 
4.2.1 schema?


You can examine all changes yourself here:

http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_4_2_1/solr/example/solr/collection1/conf/schema.xml?view=log 

http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_4_6_1/solr/example/solr/collection1/conf/schema.xml?view=log 



-- Jack Krupansky

-Original Message- From: Michael Sokolov
Sent: Saturday, March 15, 2014 1:02 PM
To: solr-user@lucene.apache.org
Subject: example schema now stores most field values

While upgrading from 4.2.1 to 4.6.1 I noticed that many of the fields
defined in the example schema.xml that used to be indexed and not stored
are now defined as indexed and stored.  Is there anything behind this
change other than the idea that it would be more convenient to have all
the values available?  Is it somehow cheaper to recover them from the
index now, so that storing (ints, say) is free?

-Mike




Re: Result merging takes too long

2014-03-16 Thread Shalin Shekhar Mangar
On Thu, Mar 13, 2014 at 1:49 PM, remi tassing tassingr...@gmail.com wrote:

 I've used the fl=id parameter to avoid retrieving the actual documents
 (step 4 in your mail) but the problem still exists.
 Any ideas on how to find the merging time(step 3)?

Actually that doesn't skip steps #4 and #5. That optimization will be
available in Solr 4.8 (the next release). See
https://issues.apache.org/jira/browse/SOLR-1880
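
When it lands, the single-pass behavior should be requestable with a
parameter along these lines (a sketch based on SOLR-1880; check the issue
for the final parameter name):

curl "http://localhost:8983/solr/collection1/select?q=project+development+agile&fl=id&distrib.singlePass=true"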

-- 
Regards,
Shalin Shekhar Mangar.


Re: CollapsingQParserPlugin facet results: fq={!collapse field=fld} vs. group=truegroup.field=fld

2014-03-16 Thread Joel Bernstein
Tim,

When using the CollapsingQParserPlugin, facet counts will be calculated for
the collapsed result set. With grouping, you need to specify group.truncate
to calculate facet counts on the collapsed result set.

You can use the tag/exclude facet options to remove the
CollapsingQParserPlugin filter query for specific facets if you want to see
facet counts for the un-collapsed result set.
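
For example (a sketch; fld and cat are placeholder field names):

fq={!collapse field=fld tag=collapsing}
facet.field={!ex=collapsing}cat

The cat facet above is computed over the un-collapsed result set because the
tagged collapse filter is excluded from it; other facets still see the
collapsed set.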

Joel

Joel Bernstein
Search Engineer at Heliosearch


On Tue, Mar 11, 2014 at 3:05 PM, tchaffee tchaf...@livingnaturally.com wrote:

 Should the same exact query using fq={!collapse field=fld} return the same
 results as group=true&group.field=fld?

 I am getting different results for my facets on those queries when I have a
 second fq=

 This happens in both

 4.6.0 1543363 - simon - 2013-11-19 11:16:33

 and

 4.8-2014-02-23_07-35-56 1570983 - hudson - 2014-02-23 07:46:13

 I can post actual queries and results but I am not seeing anything about
 how
 CollapsingQParserPlugin or grouping works in the debug.



 Thanks,
 Tim



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/CollapsingQParserPlugin-facet-results-fq-collapse-field-fld-vs-group-true-group-field-fld-tp4122952.html
 Sent from the Solr - User mailing list archive at Nabble.com.



bulk indexing - EofExceptions and big latencies after soft-commit

2014-03-16 Thread adfel70
Hi
I have a 12-node Solr 4.6.1 cluster. Each node has 2 Solr processes, running
on 8GB heap JVMs. Each node has a total of 64GB memory.
My current collection (7 shards, 3 replicas) has around 500 million docs.
I'm performing bulk indexing into the collection. I set softCommit to 10
minutes and hardCommit openSearcher=false to 15 minutes.
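
(For reference, a sketch of how those intervals look in solrconfig.xml,
assuming they are configured there rather than sent by the client:)

<autoCommit>
  <maxTime>900000</maxTime>      <!-- 15 minutes, hard commit -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>600000</maxTime>      <!-- 10 minutes, soft commit -->
</autoSoftCommit>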

I recently started seeing the following problems while indexing - every 10
minutes (and I assume that this is the 10-minute soft-commit cycle) I get
the following errors:
1. EofException from jetty in HttpOutput.write, sent from SolrDispatchFilter
2. queries to all cores start getting high latencies (more than 10 seconds)

Any idea?

thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/bulk-indexing-EofExceptions-and-big-latencies-after-soft-commit-tp4124574.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Result merging takes too long

2014-03-16 Thread Chris W
Can you share a sample query? Ensure you have your filter queries, fl fields,
and query result cache settings well tuned.

To give you an example: a month ago I had an issue where a few of our
queries were taking 3+ seconds with 5 shards, and as I added more shards the
query was taking even longer. I figured out that the root cause was a poorly
written query (improper use of q and failure to use fq for filtering
fields). Fixing that brought down latencies to sub-200 ms. More sharding
usually performs better (at least in my experience so far), and merge/fetch
time has been negligible compared to search time (especially if the search
doesn't use filters and caches).
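
For instance (a sketch; the status filter is illustrative), keep only the
scoring terms in q and move fixed constraints into fq so the filter cache
can do its job:

q=project development agile&fq=status:published&fl=id,score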


To help you more I need sample queries




On Sun, Mar 16, 2014 at 7:31 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 On Thu, Mar 13, 2014 at 1:49 PM, remi tassing tassingr...@gmail.com
 wrote:
 
  I've used the fl=id parameter to avoid retrieving the actual documents
  (step 4 in your mail) but the problem still exists.
  Any ideas on how to find the merging time(step 3)?

 Actually that doesn't skip steps #4 and #5. That optimization will be
 available in Solr 4.8 (the next release). See
 https://issues.apache.org/jira/browse/SOLR-1880

 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Best
-- 
C


Re: bulk indexing - EofExceptions and big latencies after soft-commit

2014-03-16 Thread Shawn Heisey
On 3/16/2014 10:34 AM, adfel70 wrote:
 I have a 12-node Solr 4.6.1 cluster. Each node has 2 Solr processes, running
 on 8GB heap JVMs. Each node has a total of 64GB memory.
 My current collection (7 shards, 3 replicas) has around 500 million docs. 
 I'm performing bulk indexing into the collection. I set softCommit to 10
 minutes and hardCommit openSearcher=false to 15 minutes.

How much index data does each server have on it?  This would be the sum
total of the index directories of all your cores.

 I recently started seeing the following problems while indexing - every 10
 minutes (and I assume that this is the 10-minute soft-commit cycle) I get
 the following errors:
 1. EofException from jetty in HttpOutput.write, sent from SolrDispatchFilter
 2. queries to all cores start getting high latencies (more than 10 seconds)

EofException errors happen when your client disconnects before the
request is complete.  I would strongly recommend that you *NOT*
configure hard timeouts for your client connections, or that you make
them really long, five minutes or so.  For SolrJ, this is the SO_TIMEOUT.

These problems sound like one of two things.  It could be either or both:

1) You don't have enough RAM to cache your index effectively.  With 64GB
of RAM and 16GB heap, you have approximately 48GB of RAM left over for
other software and the OS disk cache.  If the total index size on each
machine is in the neighborhood of 60GB (or larger), this might be a
problem.  If you have software other than Solr running on the machine,
you must subtract its direct and indirect memory requirements from the
available OS disk cache.

2) Indexing results in a LOT of object creation, most of which exist for
a relatively short time.  This can result in severe problems with
garbage collection pauses.

Both problems listed above (and a few others) are discussed at the wiki
page linked below.  As you will read, there are two major causes of GC
symptoms - a heap that's too small and incorrect (or nonexistent) GC
tuning.  With a very large index like yours, either or both of these GC
symptoms could be happening.

http://wiki.apache.org/solr/SolrPerformanceProblems
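
(As a sketch only: CMS-based GC flags of the kind discussed on that page
look along these lines; the exact values must be tested against your own
workload:)

JVM_OPTS="-Xms8g -Xmx8g \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly"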

Side note: You should only be running one Solr process per machine.
Running multiple processes creates additional memory overhead.  Any hard
limits that you might have run into with a single Solr process can be
overcome with configuration options for Jetty, Solr, or the operating
system.

Thanks,
Shawn



Re: Problems using solr.SpatialRecursivePrefixTreeFieldType

2014-03-16 Thread Hamish Campbell
Hi Lajos,

Running Jetty - I've checked file permissions, they're definitely correct.
Against the advice of the wiki, I've tried adding them to the solr contrib
folder with


On Mon, Mar 17, 2014 at 2:22 AM, Lajos la...@protulae.com wrote:

 Hi Hamish,

 Are you running Jetty?

 In Tomcat, I've put jts-1.13.jar in the WEB-INF/lib directory of the
 unpacked distribution and restarted. It worked fine.

 Maybe check file permissions as well ...

 Regards,

 Lajos




 On 16/03/2014 10:18, Hamish Campbell wrote:

 Hey all,

 Trying to use SpatialRecursivePrefixTreeFieldType to store extent
 polygons
 but I can't seem to get it configured correctly. Hitting this on start up:

 4670 [main] ERROR org.apache.solr.core.SolrCore  - Error loading

 core:java.util.concurrent.ExecutionException:
 java.lang.NoClassDefFoundError:
 com/vividsolutions/jts/geom/CoordinateSequenceFactory



 Per the manual, and David Smiley's previous responses, I've added the jts
 .jar files to WEB-INF/lib in the solr .war file.

 I'm still getting the error above, any other clues?




-- 
Hamish Campbell
Koordinates Ltd http://koordinates.com/?_bzhc=esig
PH   +64 9 966 0433
FAX +64 9 966 0045


Re: Problems using solr.SpatialRecursivePrefixTreeFieldType

2014-03-16 Thread Hamish Campbell
(apologies, gmail sent that a bit early.. continued below!)

Hi Lajos,

Running Jetty - I've checked file permissions, they're definitely correct.
Against the advice of the wiki, I've tried adding them to the solr contrib
folder with a lib entry in solrconfig. The log then shows it as loaded, but
the error persists.

So jts (1.13) is definitely in the WEB-INF folder and has the correct
permissions. The only thing I can think of is that maybe we've got a
slightly unusual file structure. Is jts loaded relative to a particular
folder? Everything else is running as expected.


On Mon, Mar 17, 2014 at 9:13 AM, Hamish Campbell 
hamish.campb...@koordinates.com wrote:

 Hi Lajos,

 Running Jetty - I've checked file permissions, they're definitely correct.
 Against the advice of the wiki, I've tried adding them to the solr contrib
 folder with


 On Mon, Mar 17, 2014 at 2:22 AM, Lajos la...@protulae.com wrote:

 Hi Hamish,

 Are you running Jetty?

 In Tomcat, I've put jts-1.13.jar in the WEB-INF/lib directory of the
 unpacked distribution and restarted. It worked fine.

 Maybe check file permissions as well ...

 Regards,

 Lajos




 On 16/03/2014 10:18, Hamish Campbell wrote:

 Hey all,

 Trying to use SpatialRecursivePrefixTreeFieldType to store extent
 polygons
 but I can't seem to get it configured correctly. Hitting this on start
 up:

 4670 [main] ERROR org.apache.solr.core.SolrCore  - Error loading

 core:java.util.concurrent.ExecutionException:
 java.lang.NoClassDefFoundError:
 com/vividsolutions/jts/geom/CoordinateSequenceFactory



 Per the manual, and David Smiley's previous responses, I've added the jts
 .jar files to WEB-INF/lib in the solr .war file.

 I'm still getting the error above, any other clues?




 --
 Hamish Campbell
 Koordinates Ltd http://koordinates.com/?_bzhc=esig
 PH   +64 9 966 0433
 FAX +64 9 966 0045




-- 
Hamish Campbell
Koordinates Ltd http://koordinates.com/?_bzhc=esig
PH   +64 9 966 0433
FAX +64 9 966 0045


Re: [solr 4.7.0] analysis page: issue with HTMLStripCharFilterFactory

2014-03-16 Thread Stefan Matheis
Hey Dmitry 

We had a similar issue reported and already fixed: 
https://issues.apache.org/jira/browse/SOLR-5800
I'd suspect that this patch fixes your issue too? I'd like to hear back from
you if that's the case :)

-Stefan 


On Saturday, March 15, 2014 at 6:58 PM, Dmitry Kan wrote:

 Hello,
 
 The following type does not get analyzed properly on the solr 4.7.0
 analysis page:
 
 <fieldType name="text_en_splitting" class="solr.TextField"
     positionIncrementGap="100" autoGeneratePhraseQueries="true">
   <analyzer type="index">
     <charFilter class="solr.HTMLStripCharFilterFactory"/>
     <!-- <tokenizer class="solr.WhitespaceTokenizerFactory"/> -->
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"
             words="lang/stopwords_en.txt"/>
     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="1"
             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
     <filter class="solr.PorterStemFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
             ignoreCase="true" expand="true"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"
             words="lang/stopwords_en.txt"/>
     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="0"
             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
     <filter class="solr.PorterStemFilterFactory"/>
   </analyzer>
 </fieldType>
 
 Example text:
 fox jumps
 
 Screenshot:
 http://pbrd.co/1lEVEIa
 
 This works fine in solr 4.6.1.
 
 -- 
 Dmitry
 Blog: http://dmitrykan.blogspot.com
 Twitter: http://twitter.com/dmitrykan
 
 




Re: [solr 4.7.0] analysis page: issue with HTMLStripCharFilterFactory

2014-03-16 Thread Stefan Matheis
Oh .. I'm sorry .. late to the party - I didn't see the response from Doug ..
so feel free to ignore that mail (:


On Sunday, March 16, 2014 at 9:38 PM, Stefan Matheis wrote:

 Hey Dmitry 
 
 We had a similar issue reported and already fixed: 
 https://issues.apache.org/jira/browse/SOLR-5800
 i'd suspect that this patch fixes your issue too? would like to hear back 
 from you, if that's the case :)
 
 -Stefan 
 
 On Saturday, March 15, 2014 at 6:58 PM, Dmitry Kan wrote:
 
  Hello,
  
  The following type does not get analyzed properly on the solr 4.7.0
  analysis page:
  
  <fieldType name="text_en_splitting" class="solr.TextField"
      positionIncrementGap="100" autoGeneratePhraseQueries="true">
    <analyzer type="index">
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <!-- <tokenizer class="solr.WhitespaceTokenizerFactory"/> -->
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="lang/stopwords_en.txt"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" generateNumberParts="1" catenateWords="1"
              catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="lang/stopwords_en.txt"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" generateNumberParts="1" catenateWords="0"
              catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
  </fieldType>
  
  Example text:
  fox jumps
  
  Screenshot:
  http://pbrd.co/1lEVEIa
  
  This works fine in solr 4.6.1.
  
  -- 
  Dmitry
  Blog: http://dmitrykan.blogspot.com
  Twitter: http://twitter.com/dmitrykan
  
  
  
 
 



Re: Solr Japanese support

2014-03-16 Thread Benson Margulies
Your problem has nothing to do with Japanese. Perhaps a content-type
for CSV would work better?

On Sat, Mar 15, 2014 at 12:50 PM, Bala Iyer grb...@yahoo.com wrote:
 Hi,

 I am new to Solr japanese.
 I added the support for japanese on schema.xml
 How can I insert Japanese text into that field either by solr client (java /
 php / ruby ) or by curl


 schema.xml

 <field name="username" type="string" indexed="true" stored="true"
        multiValued="true" omitNorms="true" termVectors="true"/>
 <field name="timestamp" type="date" indexed="true" stored="true"
        multiValued="true" omitNorms="true" termVectors="true"/>
 <field name="jtxt" type="text_ja" indexed="true" stored="true"
        multiValued="true" omitNorms="true" termVectors="true"/>

 <fieldType name="text_ja" class="solr.TextField"
            positionIncrementGap="100" autoGeneratePhraseQueries="false">
   <analyzer>
     <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
     <!-- <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"
          userDictionary="lang/userdict_ja.txt"/> -->
     <!-- Reduces inflected verbs and adjectives to their base/dictionary forms (辞書形) -->
     <filter class="solr.JapaneseBaseFormFilterFactory"/>
     <!-- Removes tokens with certain part-of-speech tags -->
     <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt"/>
     <!-- Normalizes full-width romaji to half-width and half-width kana to full-width (Unicode NFKC subset) -->
     <filter class="solr.CJKWidthFilterFactory"/>
     <!-- Removes common tokens typically not useful for search, but which have a negative effect on ranking -->
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt"/>
     <!-- Normalizes common katakana spelling variations by removing any last long sound character (U+30FC) -->
     <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/>
     <!-- Lower-cases romaji characters -->
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 my insert.csv file

 id,username,timestamp,content,jtxt
 9,x,2013-12-26T10:14:26Z,Hello ,マイ ドキュメント
 I am trying to insert through curl and it gives me an error:
 curl "http://localhost:8983/solr/collection1/update/csv?separator=,&commit=true" \
   -H "Content-Type: text/plain; charset=utf-8" --data-binary @insert.csv


 ERROR

 <?xml version="1.0" encoding="UTF-8"?>
 <response>
   <lst name="responseHeader"><int name="status">400</int><int name="QTime">23</int></lst>
   <lst name="error"><str name="msg">Document is missing mandatory uniqueKey field: id</str><int name="code">400</int></lst>
 </response>

 I know I should not use Content-Type as text/plain.


 Thanks


Querying specific database attributes or table

2014-03-16 Thread Olivier Austina
Hi,
I am new to Solr.

I would like to index and query a relational database. Is it possible to
query a specific table or attribute of the database? For example, if I have 2
tables A and B that both have the attribute name, I want to have only the
results from table A and not from table B. Is that possible?
Can I restrict the query to only one table without having results from
other tables?
Is it possible to query a specific attribute of a table?
Is it possible to do join query like SQL?
Any suggestion is welcome. Thank you.

Regards
Olivier


Re: Querying specific database attributes or table

2014-03-16 Thread Ahmet Arslan
Hi Oliver,

If you index a field (say source) holding the table name, you can filter by
that field,
e.g. fq=source:TableA    http://wiki.apache.org/solr/CommonQueryParameters#fq
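
One way to populate such a source field is to select it as a literal in each
DataImportHandler entity (a sketch; the table, column, and field names are
illustrative):

<dataConfig>
  <document>
    <!-- each entity tags its rows with the table they came from -->
    <entity name="tableA" query="SELECT id, name, 'TableA' AS source FROM A"/>
    <entity name="tableB" query="SELECT id, name, 'TableB' AS source FROM B"/>
  </document>
</dataConfig>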

It is possible to query a specific attribute of a table. This usually
corresponds to a field in Solr.

See https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers 
for different type of joins.

Ahmet




On Monday, March 17, 2014 12:44 AM, Olivier Austina olivier.aust...@gmail.com 
wrote:
Hi,
I am new to Solr.

I would like to index and query a relational database. Is it possible to
query a specific table or attribute of the database? For example, if I have 2
tables A and B that both have the attribute name, I want to have only the
results from table A and not from table B. Is that possible?
Can I restrict the query to only one table without having results from
other tables?
Is it possible to query a specific attribute of a table?
Is it possible to do join query like SQL?
Any suggestion is welcome. Thank you.

Regards
Olivier



Re: Solr Japanese support

2014-03-16 Thread Alexandre Rafalovitch
Which version of Solr are you on? Because for Solr 4, the endpoint
should be /update and the Content-Type should be correct. See:
http://wiki.apache.org/solr/UpdateCSV

I would expect the problem NOT to be around Japanese, but around other
things. You could, for example, try to index Japanese into the example
collection that comes with Solr. That way you get all the other variables
correct. Then you add another field+fieldType and see if it still
works.
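
Concretely, something along these lines should work (a sketch, assuming Solr
4.x and that the CSV columns match the schema):

curl "http://localhost:8983/solr/collection1/update?commit=true" \
  -H "Content-Type: text/csv; charset=utf-8" \
  --data-binary @insert.csv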

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Sat, Mar 15, 2014 at 11:50 PM, Bala Iyer grb...@yahoo.com wrote:
 Hi,

 I am new to Solr japanese.
 I added the support for japanese on schema.xml
 How can I insert Japanese text into that field either by solr client (java /
 php / ruby ) or by curl


 schema.xml

 <field name="username" type="string" indexed="true" stored="true"
        multiValued="true" omitNorms="true" termVectors="true"/>
 <field name="timestamp" type="date" indexed="true" stored="true"
        multiValued="true" omitNorms="true" termVectors="true"/>
 <field name="jtxt" type="text_ja" indexed="true" stored="true"
        multiValued="true" omitNorms="true" termVectors="true"/>

 <fieldType name="text_ja" class="solr.TextField"
            positionIncrementGap="100" autoGeneratePhraseQueries="false">
   <analyzer>
     <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
     <!-- <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"
          userDictionary="lang/userdict_ja.txt"/> -->
     <!-- Reduces inflected verbs and adjectives to their base/dictionary forms (辞書形) -->
     <filter class="solr.JapaneseBaseFormFilterFactory"/>
     <!-- Removes tokens with certain part-of-speech tags -->
     <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt"/>
     <!-- Normalizes full-width romaji to half-width and half-width kana to full-width (Unicode NFKC subset) -->
     <filter class="solr.CJKWidthFilterFactory"/>
     <!-- Removes common tokens typically not useful for search, but which have a negative effect on ranking -->
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt"/>
     <!-- Normalizes common katakana spelling variations by removing any last long sound character (U+30FC) -->
     <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/>
     <!-- Lower-cases romaji characters -->
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 my insert.csv file

 id,username,timestamp,content,jtxt
 9,x,2013-12-26T10:14:26Z,Hello ,マイ ドキュメント
 I am trying to insert through curl and it gives me an error:
 curl "http://localhost:8983/solr/collection1/update/csv?separator=,&commit=true" \
   -H "Content-Type: text/plain; charset=utf-8" --data-binary @insert.csv


 ERROR

 <?xml version="1.0" encoding="UTF-8"?>
 <response>
   <lst name="responseHeader"><int name="status">400</int><int name="QTime">23</int></lst>
   <lst name="error"><str name="msg">Document is missing mandatory uniqueKey field: id</str><int name="code">400</int></lst>
 </response>

 I know I should not use Content-Type as text/plain.


 Thanks


Re: Solr Japanese support

2014-03-16 Thread Tri Dang
Please unsubscribe me.






Re: example schema now stores most field values

2014-03-16 Thread Alexandre Rafalovitch
What do you expect to get out of updates except for stability and new
features? In schema.xml, there is a version field. If you don't change
that, you should not be affected by new defaults. As for new features
(e.g. Near-Real-Time), you have to enable them explicitly in a couple of
places to get them going.

I don't think restarting every upgrade from example schema is a good
idea. It has too much stuff in it. Perhaps, what we need is a separate
minimal example that would show just the basics. But then, what would
go into that (NRT support? dynamic schema?).

Regards,
   Alex.
P.s. I am not disagreeing that the migration assistance would be
awesome. I am just trying to draw out more details from those doing it
in the trenches and make the discussion more concrete.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Sun, Mar 16, 2014 at 9:28 PM, Michael Sokolov
msoko...@safaribooksonline.com wrote:
 Thanks for hunting that down, Jack. It may very well have been a change
 that we made (to remove the stored=true). Sorry if I led you on a wild
 goose chase.

 Actually I wonder if people have a better method for performing config
 upgrades than mine.  I find that every time I take a Solr upgrade, I need to
 do a point-by-point comparison of the new schema (and solrconfig) with my
 current one in order to upgrade the configuration while keeping my custom
 fields, etc. plus there are all kinds of fields in the sample schema that I
 don't really need: sku, manu, etc. which I typically remove just to cut down
 on clutter.  Does anyone have a script that uses svn to do this in a clever
 way that they'd like to share?

 -Mike


 On 3/15/2014 4:52 PM, Jack Krupansky wrote:

 Could it be that you had dropped a pre-4.2.1 schema into 4.2.1? I mean, I
 just exhaustively examined all schema.xml changes between 4.2.1 and 4.6.1
 (all 6 of them) and saw no wholesale change to stored=true. Maybe somebody
 on your end removed a lot of fields from the 4.2.1 release of schema.xml.

 Could you give a couple of examples of fields that changed from your 4.2.1
 schema?

 You can examine all changes yourself here:


 http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_4_2_1/solr/example/solr/collection1/conf/schema.xml?view=log

 http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_4_6_1/solr/example/solr/collection1/conf/schema.xml?view=log

 -- Jack Krupansky

 -Original Message- From: Michael Sokolov
 Sent: Saturday, March 15, 2014 1:02 PM
 To: solr-user@lucene.apache.org
 Subject: example schema now stores most field values

 While upgrading from 4.2.1 to 4.6.1 I noticed that many of the fields
 defined in the example schema.xml that used to be indexed and not stored
 are now defined as indexed and stored.  Is there anything behind this
 change other than the idea that it would be more convenient to have all
 the values available?  Is it somehow cheaper to recover them from the
 index now, so that storing (ints, say) is free?

 -Mike




Re: Solr Japanese support

2014-03-16 Thread Erick Erickson
Tri Dang:

Please follow the instructions here:

https://lucene.apache.org/solr/discussion.html

Best,
Erick

On Sun, Mar 16, 2014 at 6:15 PM, Tri Dang tritd...@yahoo.com wrote:
 Please unsubscribe me.







Re: Solr metrics in Codahale metrics and Graphite?

2014-03-16 Thread Greg Pendlebury
In the codahale metrics library there are 1, 5 and 15 minute moving
averages just like you would see in a tool like 'top'. However in Solr I
can only see 5 and 15 minute values, plus 'avgRequestsPerSecond'. I assumed
this was the 1 minute value initially, but it seems to be something like
the average since startup. I haven't looked thoroughly, but it is around 1%
of the other two in a normally idle test cluster after load tests have been
running for long enough that the 5 and 15 minute numbers match the load
testing throughput.

Is this difference deliberate? or an accident? or am I wrong entirely? I
can compute the overall average anyway, given that the stats also include
the start time of the search handler and the total search count, so I
thought it might be an accident.

Ta,
Greg





On 4 May 2013 01:19, Furkan KAMACI furkankam...@gmail.com wrote:

 Has anybody tested Ganglia with JMXTrans in a production environment for
 SolrCloud?

 2013/4/26 Dmitry Kan solrexp...@gmail.com

  Alan, Shawn,
 
  If backporting to 3.x is hard, no worries, we don't necessarily require
 the
  patch as we are heading to 4.x eventually. It is just much easier within
  our organization to test on the existing solr 3.4 as there are a few of
  internal dependencies and custom code on top of solr. Also solr upgrades
 on
  production systems are usually pushed forward by a month or so starting
 the
  upgrade on development systems (requires lots of testing and
  verifications).
 
  Nevertheless, it is good effort to make #solr #graphite friendly, so keep
  it up! :)
 
  Dmitry
 
 
 
 
  On Thu, Apr 25, 2013 at 9:29 PM, Shawn Heisey s...@elyograg.org wrote:
 
   On 4/25/2013 6:30 AM, Dmitry Kan wrote:
We are very much interested in 3.4.
   
On Thu, Apr 25, 2013 at 12:55 PM, Alan Woodward a...@flax.co.uk
  wrote:
This is on top of trunk at the moment, but would be back ported to
 4.4
   if
there was interest.
  
   This will be bad news, I'm sorry:
  
   All remaining work on 3.x versions happens in the 3.6 branch. This
   branch is in maintenance mode.  It will only get fixes for serious bugs
   with no workaround.  Improvements and new features won't be considered
   at all.
  
   You're welcome to try backporting patches from newer issues.  Due to
 the
   major differences in the 3x and 4x codebases, the best case scenario is
   that you'll be facing a very manual task.  Some changes can't be
   backported because they rely on other features only found in 4.x code.
  
   Thanks,
   Shawn
  
  
 



Re: Solr metrics in Codahale metrics and Graphite?

2014-03-16 Thread Shalin Shekhar Mangar
Greg, SOLR-4735 (using the codahale metrics lib) hasn't been committed
yet. It is still work in progress.

Actually the internal Solr Metrics class has a method to return 1
minute stats but it is not used.

On Mon, Mar 17, 2014 at 10:06 AM, Greg Pendlebury
greg.pendleb...@gmail.com wrote:
 In the codahale metrics library there are 1, 5 and 15 minute moving
 averages just like you would see in a tool like 'top'. However in Solr I
 can only see 5 and 15 minute values, plus 'avgRequestsPerSecond'. I assumed
 this was the 1 minute value initially, but it seems to be something like
 the average since startup. I haven't looked thoroughly, but it is around 1%
 of the other two in a normally idle test cluster after load tests have been
 running for long enough that the 5 and 15 minute numbers match the load
 testing throughput.

 Is this difference deliberate? or an accident? or am I wrong entirely? I
 can compute the overall average anyway, given that the stats also include
 the start time of the search handler and the total search count, so I
 thought it might be an accident.

 Ta,
 Greg





 On 4 May 2013 01:19, Furkan KAMACI furkankam...@gmail.com wrote:

 Has anybody tested Ganglia with JMXTrans in a production environment for
 SolrCloud?

 2013/4/26 Dmitry Kan solrexp...@gmail.com

  Alan, Shawn,
 
  If backporting to 3.x is hard, no worries, we don't necessarily require
 the
  patch as we are heading to 4.x eventually. It is just much easier within
  our organization to test on the existing solr 3.4 as there are a few of
  internal dependencies and custom code on top of solr. Also solr upgrades
 on
  production systems are usually pushed forward by a month or so starting
 the
  upgrade on development systems (requires lots of testing and
  verifications).
 
  Nevertheless, it is good effort to make #solr #graphite friendly, so keep
  it up! :)
 
  Dmitry
 
 
 
 
  On Thu, Apr 25, 2013 at 9:29 PM, Shawn Heisey s...@elyograg.org wrote:
 
   On 4/25/2013 6:30 AM, Dmitry Kan wrote:
We are very much interested in 3.4.
   
On Thu, Apr 25, 2013 at 12:55 PM, Alan Woodward a...@flax.co.uk
  wrote:
This is on top of trunk at the moment, but would be back ported to
 4.4
   if
there was interest.
  
   This will be bad news, I'm sorry:
  
   All remaining work on 3.x versions happens in the 3.6 branch. This
   branch is in maintenance mode.  It will only get fixes for serious bugs
   with no workaround.  Improvements and new features won't be considered
   at all.
  
   You're welcome to try backporting patches from newer issues.  Due to
 the
   major differences in the 3x and 4x codebases, the best case scenario is
   that you'll be facing a very manual task.  Some changes can't be
   backported because they rely on other features only found in 4.x code.
  
   Thanks,
   Shawn
  
  
 




-- 
Regards,
Shalin Shekhar Mangar.


Re: Solr metrics in Codahale metrics and Graphite?

2014-03-16 Thread Greg Pendlebury
Oh my bad. I thought it was already in. Thanks for the correction.

Ta,
Greg


On 17 March 2014 15:55, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

 Greg, SOLR-4735 (using the codahale metrics lib) hasn't been committed
 yet. It is still work in progress.

 Actually the internal Solr Metrics class has a method to return 1
 minute stats but it is not used.

 On Mon, Mar 17, 2014 at 10:06 AM, Greg Pendlebury
 greg.pendleb...@gmail.com wrote:
  In the codahale metrics library there are 1, 5 and 15 minute moving
  averages just like you would see in a tool like 'top'. However in Solr I
  can only see 5 and 15 minute values, plus 'avgRequestsPerSecond'. I
 assumed
  this was the 1 minute value initially, but it seems to be something like
  the average since startup. I haven't looked thoroughly, but it is around
 1%
  of the other two in a normally idle test cluster after load tests have
 been
  running for long enough that the 5 and 15 minute numbers match the load
  testing throughput.
 
  Is this difference deliberate? or an accident? or am I wrong entirely? I
  can compute the overall average anyway, given that the stats also include
  the start time of the search handler and the total search count, so I
  thought it might be an accident.
 
  Ta,
  Greg
 
 
 
 
 
  On 4 May 2013 01:19, Furkan KAMACI furkankam...@gmail.com wrote:
 
  Has anybody tested Ganglia with JMXTrans in a production environment for
  SolrCloud?
 
  2013/4/26 Dmitry Kan solrexp...@gmail.com
 
   Alan, Shawn,
  
   If backporting to 3.x is hard, no worries, we don't necessarily
 require
  the
   patch as we are heading to 4.x eventually. It is just much easier
 within
   our organization to test on the existing solr 3.4 as there are a few
 of
   internal dependencies and custom code on top of solr. Also solr
 upgrades
  on
   production systems are usually pushed forward by a month or so
 starting
  the
   upgrade on development systems (requires lots of testing and
   verifications).
  
   Nevertheless, it is good effort to make #solr #graphite friendly, so
 keep
   it up! :)
  
   Dmitry
  
  
  
  
   On Thu, Apr 25, 2013 at 9:29 PM, Shawn Heisey s...@elyograg.org
 wrote:
  
On 4/25/2013 6:30 AM, Dmitry Kan wrote:
 We are very much interested in 3.4.

 On Thu, Apr 25, 2013 at 12:55 PM, Alan Woodward a...@flax.co.uk
   wrote:
 This is on top of trunk at the moment, but would be back ported
 to
  4.4
if
 there was interest.
   
This will be bad news, I'm sorry:
   
All remaining work on 3.x versions happens in the 3.6 branch. This
branch is in maintenance mode.  It will only get fixes for serious
 bugs
with no workaround.  Improvements and new features won't be
 considered
at all.
   
You're welcome to try backporting patches from newer issues.  Due to
  the
major differences in the 3x and 4x codebases, the best case
 scenario is
that you'll be facing a very manual task.  Some changes can't be
backported because they rely on other features only found in 4.x
 code.
   
Thanks,
Shawn
   
   
  
 



 --
 Regards,
 Shalin Shekhar Mangar.