Re: Indexing slower on a better system

2017-02-13 Thread Zheng Lin Edwin Yeo
Ok no problem.

So you were saying that in your case, your indexing speed is also faster on
your MacBook Pro than on your Amazon EC2 servers, which have better
specifications?

Regards,
Edwin


On 14 February 2017 at 14:17, Walter Underwood 
wrote:

> Sorry. I haven’t used Windows in seven years and haven’t run Windows
> as a server for more than a decade.
>
> I would not recommend using Windows as your Solr OS. Windows is just not
> designed for that.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Feb 13, 2017, at 10:12 PM, Zheng Lin Edwin Yeo 
> wrote:
> >
> > Hi Walter,
> >
> > For your suggestion to try out "time gunzip < solr-6.4.1.tgz >
> > /dev/null", does it work on Windows systems? I tried it on Windows,
> > and it gives me the error "The syntax of the command is incorrect".
> >
> > In my current setup, in a single run, I can index about 16000 lines of
> > a CSV file per minute on my laptop, but I can only index less than 1600
> > lines per minute on the server, which is more than 10 times slower.
> >
> > Regards,
> > Edwin
> >
> >
> >
> > On 14 February 2017 at 13:45, Zheng Lin Edwin Yeo 
> > wrote:
> >
> >> Thanks for the info.
> >>
> >> Yes, I'm running Solr 6.4.1 on both hosts.
> >>
> >> Regards,
> >> Edwin
> >>
> >>
> >> On 14 February 2017 at 13:21, Walter Underwood 
> >> wrote:
> >>
> >>> It is worth doing a basic CPU speed test. Once you have enough RAM,
> >>> indexing is mostly CPU-bound.
> >>>
> >>> Try something like this. Run it once to get the tgz file cached in OS
> >>> file buffers, then once to time it.
> >>>
> >>> time gunzip < solr-6.4.1.tgz > /dev/null
> >>>
> >>> I get 1.3 seconds on an Amazon c4.8xlarge and 0.8 seconds on my MacBook.
> >>> A bigger file would be a better test, but that is the general idea.
> >>>
> >>> Also, are you running 6.4.1 on both hosts? The new metrics code caused
> >>> some slowdowns from 6.3.0 to 6.4.0.
> >>>
> >>> On the other hand, I’m indexing about a million documents per minute
> >>> into a 16 node cluster (4 shards, 4-way replication factor) built with
> >>> the c4.8xlarge instances. I’m running 64 indexing threads and 1000 doc
> >>> batches. It might go a bit faster after we switch the cloud driver in
> >>> SolrJ.
> >>>
> >>> wunder
> >>> Walter Underwood
> >>> wun...@wunderwood.org
> >>> http://observer.wunderwood.org/  (my blog)
> >>>
> >>>
>  On Feb 13, 2017, at 9:10 PM, Zheng Lin Edwin Yeo <
> edwinye...@gmail.com>
> >>> wrote:
> 
>  No, currently the server is slower, and my laptop is faster.
> 
>  But shouldn't the server be faster, since it has a much better
>  specification, like more RAM, better processor and SSD drive.
> 
>  Regards,
>  Edwin
> 
> 
>  On 14 February 2017 at 12:26, Walter Underwood  >
>  wrote:
> 
> > Are you sure the server is faster? My MacBook Pro is a lot faster
> than
> > many of our Amazon EC2 servers.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >
> >> On Feb 13, 2017, at 8:12 PM, Zheng Lin Edwin Yeo <
> >>> edwinye...@gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> I'm facing an issue where the indexing speed is slower on a server
> >> with a much better specification, with Solr running on an SSD,
> >> compared to a laptop with a normal hard disk.
> >>
> >> Both systems have the exact same configurations. The configurations
> >> were first set up on the laptop, before being replicated to the server.
> >>
> >> The setup is Solr 6.4.1, with 1 shard and 2 replicas, using an
> >> external ZooKeeper 3.4.8. The only difference is that on my laptop,
> >> both the shards and ZooKeeper are on the same hard disk, while on the
> >> server, ZooKeeper runs on its own hard disk, and each of the shards
> >> also runs on a separate hard disk. From what I know, this
> >> configuration should improve performance, not make it worse?
> >>
> >> What other reasons could cause this?
> >>
> >> I'm running on Solr 6.4.1
> >>
> >> Regards,
> >> Edwin
> >
> >
> >>>
> >>>
> >>
>
>


Re: Indexing slower on a better system

2017-02-13 Thread Walter Underwood
Sorry. I haven’t used Windows in seven years and haven’t run Windows as a 
server for more than a decade.

I would not recommend using Windows as your Solr OS. Windows is just not 
designed for that.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Feb 13, 2017, at 10:12 PM, Zheng Lin Edwin Yeo  
> wrote:
> 
> Hi Walter,
> 
> For your suggestion to try out "time gunzip < solr-6.4.1.tgz > /dev/null",
> does it work on Windows systems? I tried it on Windows, and it gives me
> the error "The syntax of the command is incorrect".
>
> In my current setup, in a single run, I can index about 16000 lines of a
> CSV file per minute on my laptop, but I can only index less than 1600
> lines per minute on the server, which is more than 10 times slower.
> 
> Regards,
> Edwin
> 
> 
> 
> On 14 February 2017 at 13:45, Zheng Lin Edwin Yeo 
> wrote:
> 
>> Thanks for the info.
>> 
>> Yes, I'm running Solr 6.4.1 on both hosts.
>> 
>> Regards,
>> Edwin
>> 
>> 
>> On 14 February 2017 at 13:21, Walter Underwood 
>> wrote:
>> 
>>> It is worth doing a basic CPU speed test. Once you have enough RAM,
>>> indexing is mostly CPU-bound.
>>> 
>>> Try something like this. Run it once to get the tgz file cached in OS
>>> file buffers, then once to time it.
>>> 
>>> time gunzip < solr-6.4.1.tgz > /dev/null
>>> 
>>> I get 1.3 seconds on an Amazon c4.8xlarge and 0.8 seconds on my MacBook.
>>> A bigger file would be a better test, but that is the general idea.
>>> 
>>> Also, are you running 6.4.1 on both hosts? The new metrics code caused
>>> some slowdowns from 6.3.0 to 6.4.0.
>>> 
>>> On the other hand, I’m indexing about a million documents per minute into
>>> a 16 node cluster (4 shards, 4-way replication factor) built with the
>>> c4.8xlarge instances. I’m running 64 indexing threads and 1000 doc batches.
>>> It might go a bit faster after we switch the cloud driver in SolrJ.
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
>>> 
 On Feb 13, 2017, at 9:10 PM, Zheng Lin Edwin Yeo 
>>> wrote:
 
 No, currently the server is slower, and my laptop is faster.
 
 But shouldn't the server be faster, since it has a much better
 specification, like more RAM, better processor and SSD drive.
 
 Regards,
 Edwin
 
 
 On 14 February 2017 at 12:26, Walter Underwood 
 wrote:
 
> Are you sure the server is faster? My MacBook Pro is a lot faster than
> many of our Amazon EC2 servers.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
> 
>> On Feb 13, 2017, at 8:12 PM, Zheng Lin Edwin Yeo <
>>> edwinye...@gmail.com>
> wrote:
>> 
>> Hi,
>> 
>> I'm facing an issue where the indexing speed is slower on a server with
>> a much better specification, with Solr running on an SSD, compared to a
>> laptop with a normal hard disk.
>> 
>> Both systems have the exact same configurations. The configurations were
>> first set up on the laptop, before being replicated to the server.
>> 
>> The setup is Solr 6.4.1, with 1 shard and 2 replicas, using an external
>> ZooKeeper 3.4.8. The only difference is that on my laptop, both the
>> shards and ZooKeeper are on the same hard disk, while on the server,
>> ZooKeeper runs on its own hard disk, and each of the shards also runs on
>> a separate hard disk. From what I know, this configuration should
>> improve performance, not make it worse?
>> 
>> What other reasons could cause this?
>> 
>> I'm running on Solr 6.4.1
>> 
>> Regards,
>> Edwin
> 
> 
>>> 
>>> 
>> 



Re: Indexing slower on a better system

2017-02-13 Thread Zheng Lin Edwin Yeo
Hi Walter,

For your suggestion to try out "time gunzip < solr-6.4.1.tgz > /dev/null",
does it work on Windows systems? I tried it on Windows, and it gives me
the error "The syntax of the command is incorrect".
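
A rough cross-platform Java equivalent of that one-liner (a sketch only: it
assumes the tgz is in the working directory; run it twice so the file is
cached, as Walter suggested):

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class GunzipTimer {
    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        // Decompress the archive and discard the bytes, like gunzip > /dev/null
        try (InputStream in = new GZIPInputStream(
                new FileInputStream("solr-6.4.1.tgz"))) {
            byte[] buf = new byte[65536];
            while (in.read(buf) != -1) { /* discard */ }
        }
        System.out.printf("%.3f seconds%n", (System.nanoTime() - start) / 1e9);
    }
}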

In my current setup, in a single run, I can index about 16000 lines of a
CSV file per minute on my laptop, but I can only index less than 1600
lines per minute on the server, which is more than 10 times slower.

Regards,
Edwin



On 14 February 2017 at 13:45, Zheng Lin Edwin Yeo 
wrote:

> Thanks for the info.
>
> Yes, I'm running Solr 6.4.1 on both hosts.
>
> Regards,
> Edwin
>
>
> On 14 February 2017 at 13:21, Walter Underwood 
> wrote:
>
>> It is worth doing a basic CPU speed test. Once you have enough RAM,
>> indexing is mostly CPU-bound.
>>
>> Try something like this. Run it once to get the tgz file cached in OS
>> file buffers, then once to time it.
>>
>> time gunzip < solr-6.4.1.tgz > /dev/null
>>
>> I get 1.3 seconds on an Amazon c4.8xlarge and 0.8 seconds on my MacBook.
>> A bigger file would be a better test, but that is the general idea.
>>
>> Also, are you running 6.4.1 on both hosts? The new metrics code caused
>> some slowdowns from 6.3.0 to 6.4.0.
>>
>> On the other hand, I’m indexing about a million documents per minute into
>> a 16 node cluster (4 shards, 4-way replication factor) built with the
>> c4.8xlarge instances. I’m running 64 indexing threads and 1000 doc batches.
>> It might go a bit faster after we switch the cloud driver in SolrJ.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>>
>>
>> > On Feb 13, 2017, at 9:10 PM, Zheng Lin Edwin Yeo 
>> wrote:
>> >
>> > No, currently the server is slower, and my laptop is faster.
>> >
>> > But shouldn't the server be faster, since it has a much better
>> > specification, like more RAM, better processor and SSD drive.
>> >
>> > Regards,
>> > Edwin
>> >
>> >
>> > On 14 February 2017 at 12:26, Walter Underwood 
>> > wrote:
>> >
>> >> Are you sure the server is faster? My MacBook Pro is a lot faster than
>> >> many of our Amazon EC2 servers.
>> >>
>> >> wunder
>> >> Walter Underwood
>> >> wun...@wunderwood.org
>> >> http://observer.wunderwood.org/  (my blog)
>> >>
>> >>
>> >>> On Feb 13, 2017, at 8:12 PM, Zheng Lin Edwin Yeo <
>> edwinye...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I'm facing an issue where the indexing speed is slower on a server
>> >>> with a much better specification, with Solr running on an SSD,
>> >>> compared to a laptop with a normal hard disk.
>> >>>
>> >>> Both systems have the exact same configurations. The configurations
>> >>> were first set up on the laptop, before being replicated to the server.
>> >>>
>> >>> The setup is Solr 6.4.1, with 1 shard and 2 replicas, using an
>> >>> external ZooKeeper 3.4.8. The only difference is that on my laptop,
>> >>> both the shards and ZooKeeper are on the same hard disk, while on the
>> >>> server, ZooKeeper runs on its own hard disk, and each of the shards
>> >>> also runs on a separate hard disk. From what I know, this
>> >>> configuration should improve performance, not make it worse?
>> >>>
>> >>> What other reasons could cause this?
>> >>>
>> >>> I'm running on Solr 6.4.1
>> >>>
>> >>> Regards,
>> >>> Edwin
>> >>
>> >>
>>
>>
>


Re: Division in JSON Facet

2017-02-13 Thread Zheng Lin Edwin Yeo
I found that we can't use div(4,2) directly; on its own it doesn't work.

It will work if I put something like max(div(4,2)).
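
For reference, a minimal SolrJ sketch of that workaround (the field names
price and popularity are placeholders, and this assumes the 5.x
HttpSolrClient):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/collection1");
SolrQuery query = new SolrQuery("*:*");
query.setRows(0);
// Wrap the division in an aggregation such as max() so the JSON Facet
// parser accepts it as a facet function.
query.add("json.facet", "{ ratio : \"max(div(price,popularity))\" }");
QueryResponse response = client.query(query);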

Regards,
Edwin


On 10 January 2017 at 19:59, Zheng Lin Edwin Yeo 
wrote:

> Hi,
>
> I'm getting this error when I tried to do a division in JSON Facet.
>
>   "error":{
> "msg":"org.apache.solr.search.SyntaxError: Unknown aggregation agg_div in 
> ('div(4,2)', pos=4)",
> "code":400}}
>
>
> Is this division function supported in JSON Facet?
>
> I'm using this in Solr 5.4.0
>
> Regards,
> Edwin
>


Re: Continual garbage collection loop

2017-02-13 Thread Erick Erickson
Why is this a problem? Are you seeing unacceptable slowness?
It's fairly common for Java to do GC frequently; the problem comes
when it uses long stop-the-world pauses. So unless you're seeing visibly
slow performance I'd say ignore it.

Curiously, increasing the Java heap a little bit sometimes helps,
as I've seen situations where the GC recovers so little memory that
another GC cycle immediately occurs. That said, I don't see evidence
of this in what you showed.
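
If you do try a bigger heap, a minimal example, assuming Solr is started
via bin/solr with the bin/solr.in.sh include script (the value is
illustrative):

SOLR_HEAP="1g"    # sets both -Xms and -Xmx; the bin/solr default is 512m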

GCViewer is a nifty tool for visualizing the GC activity BTW.

Best,
Erick

On Mon, Feb 13, 2017 at 8:36 AM, Leon STRINGER
 wrote:
> Hi,
>
> I get an issue where, when I'm deleting and adding Solr cores, it appears
> to go into a loop, increasing CPU load and continually (every 2 seconds)
> logging to the garbage collection log.
>
> I had this problem with 6.1.0 so we've just upgraded to 6.4.1 and the
> issue still occurs. The entries being logged every 2 seconds are below
> (hope it's not too verbose). Obviously this means the log gets big quickly.
>
> We can work around the issue by restarting Solr but presumably something has
> gone wrong. Can anyone suggest if we're doing something incorrectly to cause
> this, or if it's an issue we can troubleshoot.
>
> Any advice gratefully received.
>
> On CentOS 7 with OpenJDK 1.8.0_91-b14.
>
> solr_gc.log.0.current logs the following every 2 seconds:
>
> 2017-02-13T16:19:11.230+: 5092.640: [GC (CMS Initial Mark) [1
> CMS-initial-mark: 225270K(393216K)] 225280K(502464K), 0.0030517 secs] [Times:
> user=0.01 sys=0.00, real=0.01 secs]
> 2017-02-13T16:19:11.234+: 5092.643: Total time for which application 
> threads
> were stopped: 0.0033800 seconds, Stopping threads took: 0.473 seconds
> 2017-02-13T16:19:11.234+: 5092.643: [CMS-concurrent-mark-start]
> 2017-02-13T16:19:11.359+: 5092.769: [CMS-concurrent-mark: 0.125/0.125 
> secs]
> [Times: user=0.50 sys=0.00, real=0.12 secs]
> 2017-02-13T16:19:11.359+: 5092.769: [CMS-concurrent-preclean-start]
> 2017-02-13T16:19:11.361+: 5092.771: [CMS-concurrent-preclean: 0.002/0.002
> secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> 2017-02-13T16:19:11.362+: 5092.771: [GC (CMS Final Remark) [YG occupancy: 
> 10
> K (109248 K)]{Heap before GC invocations=3236 (full 1150):
> par new generation total 109248K, used 10K [0xe000,
> 0xe800, 0xe800)
> eden space 87424K, 0% used [0xe000, 0xe0001020,
> 0xe556)
> from space 21824K, 0% used [0xe6ab, 0xe6ab1830,
> 0xe800)
> to space 21824K, 0% used [0xe556, 0xe556,
> 0xe6ab)
> concurrent mark-sweep generation total 393216K, used 225270K
> [0xe800, 0x0001, 0x0001)
> Metaspace used 176850K, capacity 179580K, committed 181092K, reserved 1210368K
> class space used 18794K, capacity 19506K, committed 19836K, reserved 1048576K
> 2017-02-13T16:19:11.362+: 5092.771: [GC (CMS Final Remark)
> 2017-02-13T16:19:11.362+: 5092.771: [ParNew
> Desired survivor size 20112992 bytes, new threshold 8 (max 8)
> - age 2: 160 bytes, 160 total
> - age 4: 32 bytes, 192 total
> : 10K->6K(109248K), 0.0041872 secs] 225280K->225276K(502464K), 0.0042455 secs]
> [Times: user=0.01 sys=0.00, real=0.01 secs]
> Heap after GC invocations=3237 (full 1150):
> par new generation total 109248K, used 6K [0xe000,
> 0xe800, 0xe800)
> eden space 87424K, 0% used [0xe000, 0xe000,
> 0xe556)
> from space 21824K, 0% used [0xe556, 0xe5561830,
> 0xe6ab)
> to space 21824K, 0% used [0xe6ab, 0xe6ab,
> 0xe800)
> concurrent mark-sweep generation total 393216K, used 225270K
> [0xe800, 0x0001, 0x0001)
> Metaspace used 176850K, capacity 179580K, committed 181092K, reserved 1210368K
> class space used 18794K, capacity 19506K, committed 19836K, reserved 1048576K
> }
> 2017-02-13T16:19:11.366+: 5092.775: [Rescan (parallel) , 0.0018980
> secs]2017-02-13T16:19:11.368+: 5092.777: [weak refs processing, 0.0004940
> secs]2017-02-13T16:19:11.368+: 5092.778: [class unloading, 0.0580950
> secs]2017-02-13T16:19:11.426+: 5092.836: [scrub symbol table, 0.0110875
> secs]2017-02-13T16:19:11.438+: 5092.847: [scrub string table, 0.0019072
> secs][1 CMS-remark: 225270K(393216K)] 225276K(502464K), 0.0780250 secs] 
> [Times:
> user=0.09 sys=0.00, real=0.08 secs]
> 2017-02-13T16:19:11.440+: 5092.849: Total time for which application 
> threads
> were stopped: 0.0782677 seconds, Stopping threads took: 0.411 seconds
> 2017-02-13T16:19:11.440+: 5092.849: [CMS-concurrent-sweep-start]
> 2017-02-13T16:19:11.546+: 5092.955: [CMS-concurrent-sweep: 0.106/0.106 
> secs]
> [Times: user=0.11 sys=0.00, real=0.11 secs]
> 2017-02-13T16:19:11.546+: 5092.955: 

Re: Indexing slower on a better system

2017-02-13 Thread Zheng Lin Edwin Yeo
Thanks for the info.

Yes, I'm running Solr 6.4.1 on both hosts.

Regards,
Edwin


On 14 February 2017 at 13:21, Walter Underwood 
wrote:

> It is worth doing a basic CPU speed test. Once you have enough RAM,
> indexing is mostly CPU-bound.
>
> Try something like this. Run it once to get the tgz file cached in OS file
> buffers, then once to time it.
>
> time gunzip < solr-6.4.1.tgz > /dev/null
>
> I get 1.3 seconds on an Amazon c4.8xlarge and 0.8 seconds on my MacBook. A
> bigger file would be a better test, but that is the general idea.
>
> Also, are you running 6.4.1 on both hosts? The new metrics code caused
> some slowdowns from 6.3.0 to 6.4.0.
>
> On the other hand, I’m indexing about a million documents per minute into
> a 16 node cluster (4 shards, 4-way replication factor) built with the
> c4.8xlarge instances. I’m running 64 indexing threads and 1000 doc batches.
> It might go a bit faster after we switch the cloud driver in SolrJ.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Feb 13, 2017, at 9:10 PM, Zheng Lin Edwin Yeo 
> wrote:
> >
> > No, currently the server is slower, and my laptop is faster.
> >
> > But shouldn't the server be faster, since it has a much better
> > specification, like more RAM, better processor and SSD drive.
> >
> > Regards,
> > Edwin
> >
> >
> > On 14 February 2017 at 12:26, Walter Underwood 
> > wrote:
> >
> >> Are you sure the server is faster? My MacBook Pro is a lot faster than
> >> many of our Amazon EC2 servers.
> >>
> >> wunder
> >> Walter Underwood
> >> wun...@wunderwood.org
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>
> >>> On Feb 13, 2017, at 8:12 PM, Zheng Lin Edwin Yeo  >
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I'm facing an issue where the indexing speed is slower on a server
> >>> with a much better specification, with Solr running on an SSD,
> >>> compared to a laptop with a normal hard disk.
> >>>
> >>> Both systems have the exact same configurations. The configurations
> >>> were first set up on the laptop, before being replicated to the server.
> >>>
> >>> The setup is Solr 6.4.1, with 1 shard and 2 replicas, using an
> >>> external ZooKeeper 3.4.8. The only difference is that on my laptop,
> >>> both the shards and ZooKeeper are on the same hard disk, while on the
> >>> server, ZooKeeper runs on its own hard disk, and each of the shards
> >>> also runs on a separate hard disk. From what I know, this
> >>> configuration should improve performance, not make it worse?
> >>>
> >>> What other reasons could cause this?
> >>>
> >>> I'm running on Solr 6.4.1
> >>>
> >>> Regards,
> >>> Edwin
> >>
> >>
>
>


Re: Indexing slower on a better system

2017-02-13 Thread Walter Underwood
It is worth doing a basic CPU speed test. Once you have enough RAM, indexing is 
mostly CPU-bound.

Try something like this. Run it once to get the tgz file cached in OS file 
buffers, then once to time it.

time gunzip < solr-6.4.1.tgz > /dev/null

I get 1.3 seconds on an Amazon c4.8xlarge and 0.8 seconds on my MacBook. A 
bigger file would be a better test, but that is the general idea.

Also, are you running 6.4.1 on both hosts? The new metrics code caused some 
slowdowns from 6.3.0 to 6.4.0.

On the other hand, I’m indexing about a million documents per minute into a 16 
node cluster (4 shards, 4-way replication factor) built with the c4.8xlarge 
instances. I’m running 64 indexing threads and 1000 doc batches. It might go a 
bit faster after we switch the cloud driver in SolrJ.
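
A skeletal version of that indexing loop, for the curious (a sketch only:
the 64 threads and 1000-doc batches match the numbers above, but the
ZooKeeper string, collection name, and documents() source are placeholders):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

ExecutorService pool = Executors.newFixedThreadPool(64);
CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181");
client.setDefaultCollection("mycollection");

List<SolrInputDocument> batch = new ArrayList<>(1000);
for (SolrInputDocument doc : documents()) {   // documents() is a stand-in
    batch.add(doc);
    if (batch.size() == 1000) {
        final List<SolrInputDocument> toSend = new ArrayList<>(batch);
        pool.submit(() -> client.add(toSend));  // each batch is one request
        batch.clear();
    }
}
pool.shutdown();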

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Feb 13, 2017, at 9:10 PM, Zheng Lin Edwin Yeo  wrote:
> 
> No, currently the server is slower, and my laptop is faster.
> 
> But shouldn't the server be faster, since it has a much better
> specification, like more RAM, better processor and SSD drive.
> 
> Regards,
> Edwin
> 
> 
> On 14 February 2017 at 12:26, Walter Underwood 
> wrote:
> 
>> Are you sure the server is faster? My MacBook Pro is a lot faster than
>> many of our Amazon EC2 servers.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>> 
>>> On Feb 13, 2017, at 8:12 PM, Zheng Lin Edwin Yeo 
>> wrote:
>>> 
>>> Hi,
>>> 
>>> I'm facing an issue where the indexing speed is slower on a server with
>>> a much better specification, with Solr running on an SSD, compared to a
>>> laptop with a normal hard disk.
>>> 
>>> Both systems have the exact same configurations. The configurations were
>>> first set up on the laptop, before being replicated to the server.
>>> 
>>> The setup is Solr 6.4.1, with 1 shard and 2 replicas, using an external
>>> ZooKeeper 3.4.8. The only difference is that on my laptop, both the
>>> shards and ZooKeeper are on the same hard disk, while on the server,
>>> ZooKeeper runs on its own hard disk, and each of the shards also runs on
>>> a separate hard disk. From what I know, this configuration should
>>> improve performance, not make it worse?
>>> 
>>> What other reasons could cause this?
>>> 
>>> I'm running on Solr 6.4.1
>>> 
>>> Regards,
>>> Edwin
>> 
>> 



Re: Indexing slower on a better system

2017-02-13 Thread Zheng Lin Edwin Yeo
No, currently the server is slower, and my laptop is faster.

But shouldn't the server be faster, since it has a much better
specification, like more RAM, better processor and SSD drive.

Regards,
Edwin


On 14 February 2017 at 12:26, Walter Underwood 
wrote:

> Are you sure the server is faster? My MacBook Pro is a lot faster than
> many of our Amazon EC2 servers.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Feb 13, 2017, at 8:12 PM, Zheng Lin Edwin Yeo 
> wrote:
> >
> > Hi,
> >
> > I'm facing an issue where the indexing speed is slower on a server with
> > a much better specification, with Solr running on an SSD, compared to a
> > laptop with a normal hard disk.
> >
> > Both systems have the exact same configurations. The configurations were
> > first set up on the laptop, before being replicated to the server.
> >
> > The setup is Solr 6.4.1, with 1 shard and 2 replicas, using an external
> > ZooKeeper 3.4.8. The only difference is that on my laptop, both the
> > shards and ZooKeeper are on the same hard disk, while on the server,
> > ZooKeeper runs on its own hard disk, and each of the shards also runs on
> > a separate hard disk. From what I know, this configuration should
> > improve performance, not make it worse?
> >
> > What other reasons could cause this?
> >
> > I'm running on Solr 6.4.1
> >
> > Regards,
> > Edwin
>
>


Re: solrj 5.5.0 query exception

2017-02-13 Thread Ray Niu
The spellcheck response format changed in 5.0 and is not backward compatible.
alias <524839...@qq.com> wrote on Monday, 13 February 2017 at 6:05 PM:
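
A minimal SolrJ sketch of the workaround mentioned below (assuming the 5.x
HttpSolrClient; the core URL is illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/core1");
SolrQuery query = new SolrQuery("*:*");
// Skip spellcheck so SolrJ 5.5 never tries to parse the old-format
// spellcheck section that triggers the ClassCastException.
query.set("spellcheck", false);
QueryResponse response = client.query(query);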

> Hi, when I use SolrJ 5.5.0 to query Solr 3.6, it reports the following error:
> java.lang.ClassCastException: java.lang.Boolean cannot be cast to
> org.apache.solr.common.util.NamedList
> at org.apache.solr.client.solrj.response.SpellCheckResponse.<init>
> (SpellCheckResponse.java:47)
> at org.apache.solr.client.solrj.response.QueryResponse.extractSpellCheckInfo
> (QueryResponse.java:179)
> at org.apache.solr.client.solrj.response.QueryResponse.setResponse
> (QueryResponse.java:153)
> at org.apache.solr.client.solrj.SolrRequest.process
> (SolrRequest.java:149)
> at org.apache.solr.client.solrj.SolrClient.query
> (SolrClient.java:942)
> at org.apache.solr.client.solrj.SolrClient.query
> (SolrClient.java:957)
> at com.vip.vipme.demo.utils.SolrTest.testCategoryIdPC
> (SolrTest.java:66)
> at com.vip.vipme.demo.SolrjServlet1.doGet (SolrjServlet1.java:33)
> at javax.servlet.http.HttpServlet.service (HttpServlet.java:707)
> at javax.servlet.http.HttpServlet.service (HttpServlet.java:820)
> at org.mortbay.jetty.servlet.ServletHolder.handle
> (ServletHolder.java:487)
> at org.mortbay.jetty.servlet.ServletHandler.handle
> (ServletHandler.java:362)
> at org.mortbay.jetty.security.SecurityHandler.handle
> (SecurityHandler.java:216)
> at org.mortbay.jetty.servlet.SessionHandler.handle
> (SessionHandler.java:181)
> at org.mortbay.jetty.handler.ContextHandler.handle
> (ContextHandler.java:712)
> at org.mortbay.jetty.webapp.WebAppContext.handle
> (WebAppContext.java:405)
>
>
> Setting query.set("spellcheck", Boolean.FALSE) works around the problem,
> but I would like to know the specific reason for it.
>
>
> Thanks


Re: Indexing slower on a better system

2017-02-13 Thread Walter Underwood
Are you sure the server is faster? My MacBook Pro is a lot faster than many of 
our Amazon EC2 servers.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Feb 13, 2017, at 8:12 PM, Zheng Lin Edwin Yeo  wrote:
> 
> Hi,
> 
> I'm facing an issue where the indexing speed is slower on a server with a
> much better specification, with Solr running on an SSD, compared to a
> laptop with a normal hard disk.
> 
> Both systems have the exact same configurations. The configurations were
> first set up on the laptop, before being replicated to the server.
> 
> The setup is Solr 6.4.1, with 1 shard and 2 replicas, using an external
> ZooKeeper 3.4.8. The only difference is that on my laptop, both the shards
> and ZooKeeper are on the same hard disk, while on the server, ZooKeeper
> runs on its own hard disk, and each of the shards also runs on a separate
> hard disk. From what I know, this configuration should improve
> performance, not make it worse?
> 
> What other reasons could cause this?
> 
> I'm running on Solr 6.4.1
> 
> Regards,
> Edwin



Re: what is the bottleneck of solr

2017-02-13 Thread Shawn Heisey
On 2/13/2017 8:27 PM, 跳舞的水滴 wrote:
> I want to get some advice about Solr. How big can a Solr cluster get
> before its performance hits a bottleneck? And how can I estimate the
> suggested cluster size? Could you give me some advice?

It is not possible to provide general information regarding hardware and
index sizing.  There are too many variables that will affect the
performance.

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Memory is by far the most precious resource for Solr performance.  An
ideal setup will have enough *extra* memory (beyond the Java heap) for
the operating system to cache the entire index in RAM.  The ideal setup
is rarely actually required for decent performance, though.

Thanks,
Shawn



Indexing slower on a better system

2017-02-13 Thread Zheng Lin Edwin Yeo
Hi,

I'm facing an issue where the indexing speed is slower on a server with a
much better specification, with Solr running on an SSD, compared to a
laptop with a normal hard disk.

Both systems have the exact same configurations. The configurations were
first set up on the laptop, before being replicated to the server.

The setup is Solr 6.4.1, with 1 shard and 2 replicas, using an external
ZooKeeper 3.4.8. The only difference is that on my laptop, both the shards
and ZooKeeper are on the same hard disk, while on the server, ZooKeeper
runs on its own hard disk, and each of the shards also runs on a separate
hard disk. From what I know, this configuration should improve
performance, not make it worse?

What other reasons could cause this?

I'm running on Solr 6.4.1

Regards,
Edwin


what is the bottleneck of solr

2017-02-13 Thread 跳舞的水滴
Hi,
   I want to get some advice about Solr. How big can a Solr cluster get before 
its performance hits a bottleneck? And how can I estimate the suggested 
cluster size?
   Could you give me some advice?
   Sincerely.


Best Regards
Daisy

solrj 5.5.0 query exception

2017-02-13 Thread alias
Hi, when I use SolrJ 5.5.0 to query Solr 3.6, it reports the following error:
java.lang.ClassCastException: java.lang.Boolean cannot be cast to 
org.apache.solr.common.util.NamedList
at org.apache.solr.client.solrj.response.SpellCheckResponse.<init> 
(SpellCheckResponse.java:47)
at 
org.apache.solr.client.solrj.response.QueryResponse.extractSpellCheckInfo 
(QueryResponse.java:179)
at org.apache.solr.client.solrj.response.QueryResponse.setResponse 
(QueryResponse.java:153)
at org.apache.solr.client.solrj.SolrRequest.process 
(SolrRequest.java:149)
at org.apache.solr.client.solrj.SolrClient.query (SolrClient.java:942)
at org.apache.solr.client.solrj.SolrClient.query (SolrClient.java:957)
at com.vip.vipme.demo.utils.SolrTest.testCategoryIdPC (SolrTest.java:66)
at com.vip.vipme.demo.SolrjServlet1.doGet (SolrjServlet1.java:33)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle 
(ServletHolder.java:487)
at org.mortbay.jetty.servlet.ServletHandler.handle 
(ServletHandler.java:362)
at org.mortbay.jetty.security.SecurityHandler.handle 
(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle 
(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle 
(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle 
(WebAppContext.java:405)


Setting query.set("spellcheck", Boolean.FALSE) works around the problem,
but I would like to know the specific reason for it.


Thanks


Re: Heads up: SOLR-10130, Performance issue in Solr 6.4.1

2017-02-13 Thread Andrzej Białecki

> On 13 Feb 2017, at 13:46, Ere Maijala  wrote:
> 
> Hi all,
> 
> this is just a quick heads-up that we've stumbled on serious performance 
> issues after upgrading to Solr 6.4.1 apparently due to the new metrics 
> collection causing a major slowdown. I've filed an issue 
> (https://issues.apache.org/jira/browse/SOLR-10130) about it, but decided to 
> post this just so that anyone else doesn't need to encounter this unprepared. 
> It seems to me that metrics would need to be explicitly disabled altogether 
> in the index config to avoid the issue.
> 
> --Ere


Unfortunately this bug is present in both 6.4.0 and 6.4.1, and needs a patch, 
i.e. config changes won’t solve it.

It’s a pity that Solr doesn’t have a continuous performance benchmark setup, 
like Lucene does.

--
Best regards,
Andrzej Bialecki

--=# http://www.lucidworks.com #=--



Re: Issues with Solr Morphline reading RFC822 files

2017-02-13 Thread Dave
Can't see what's color coded in the email. 

> On Feb 13, 2017, at 5:35 PM, Anatharaman, Srinatha (Contractor) 
>  wrote:
> 
> Hi,
> 
> I am loading email files which are in RFC822 format into SolrCloud using Flume,
> but some metadata from the emails is not getting loaded into Solr.
> Please find a sample email below; the text that was colored in bold red is
> ignored by Solr.
> I can read these files ONLY using the org.apache.tika.parser.mail.RFC822Parser
> parser. If I try to read them using TXTparser, Solr ignores the files with
> the error "No supported MIME type found for _attachment_mimetype=message/rfc822".
> 
> How do I overcome this issue? I want to read the email files without losing
> a single word from them.
> 
> Received: from resqmta-po-08v.sys..net ([196.114.154.167])
>by csp-imta02.westchester.pa.bo..net with bizsmtp
>id EClZ1u0013cy81c01E9enp; Wed, 30 Nov 2016 14:09:38 +
> Received: from resimta-po-14v.sys. .net ([96.114.154.142])
>by resqmta-po-08v.sys..net with SMTP
>id C5ZqcRB3e2dNjC5ZqcQvHl; Wed, 30 Nov 2016 14:09:38 +
> Received: from outgoingemail1.digitalrightscorp.com ([69.36.73.150])
>by resimta-po-14v.sys..net with SMTP
>id C5ZNcJfg9npCYC5Zcceh9K; Wed, 30 Nov 2016 14:09:25 +
> X-Xfinity-Message-Heuristics: IPv6:N;TLS=0;SPF=0;DMARC=
> Received: from outgoingemail1-69-150 (localhost [127.0.0.1])
>by outgoingemail1. XRightsCorp.com (Postfix) with ESMTP id 
> 15EB7100419
>for ; Wed, 30 Nov 2016 06:05:52 -0800 (PST)
> From: a...@xrightscorp.com
> To: d...@.net
> Message-ID: <551271522.6.1480514752082.JavaMail.root@outgoingemail1-69-150>
> Subject: Unauthorized Use of Copyrights RE:
> TC-cc0ae97d-8918-4a4b-8515-749ff9303bc0
> MIME-Version: 1.0
> Content-Type: text/plain; charset=us-ascii
> Content-Transfer-Encoding: 7bit
> Date: Wed, 30 Nov 2016 06:05:52 -0800 (PST)
> X-CMAE-Envelope: 
> MS4wfAIoEnMl1VVV7nPS/7pis5Gr/ijSjTNaioaGiZVCAo4cXRoeTl9Z1Nt8SYSY4kX7RpDlZuxzGbzyeRDJIorfdeodi9fzNtQETs56Or8SwlysmgQQQt4R
> kKDdiZaRx3Q0be579K6C4XZGyRC6JMDzDi1X6bXgBL8KYDFFA/aEyOBd+2Zrz1YpOi2aTjzyRc4d4MXJwaIGivtlXtZc6R5KypOhVP6eX1kx/qV9OwVzXAz6
> 
> **NOTE TO ISP: PLEASE FORWARD THE ENTIRE NOTICE***
> 
> Re: Unauthorized Use of Copyrights Owned Exclusively by The Bicycle Music 
> Company
> 
> Reference#: ZBP96D4  IP Address: 73.166.122.44
> 
> Dear Sir or Madam:
> .
> .
> .
> .
> .
> .
> 
> 
> Regards,
> ~Sri


Issues with Solr Morphline reading RFC822 files

2017-02-13 Thread Anatharaman, Srinatha (Contractor)
Hi,

I am loading email files which are in RFC822 format into SolrCloud using Flume,
but some metadata from the emails is not getting loaded into Solr.
Please find a sample email below; the text that was colored in bold red is 
ignored by Solr.
I can read these files ONLY using the org.apache.tika.parser.mail.RFC822Parser 
parser. If I try to read them using TXTparser, Solr ignores the files with the 
error "No supported MIME type found for _attachment_mimetype=message/rfc822".

How do I overcome this issue? I want to read the email files without losing 
a single word from them.
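
One way to check what the parser itself extracts, independent of Flume and
Solr, is a standalone Tika run (a sketch; the file name is illustrative):

import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.mail.RFC822Parser;
import org.apache.tika.sax.BodyContentHandler;

try (InputStream in = new FileInputStream("sample.eml")) {
    BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no size limit
    Metadata metadata = new Metadata();
    new RFC822Parser().parse(in, handler, metadata, new ParseContext());
    System.out.println(handler);                // extracted body text
    for (String name : metadata.names()) {      // extracted headers
        System.out.println(name + " = " + metadata.get(name));
    }
}

If text is already missing here, the loss happens in the Tika parse rather
than in the Solr/Morphline configuration.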

Received: from resqmta-po-08v.sys..net ([196.114.154.167])
by csp-imta02.westchester.pa.bo..net with bizsmtp
id EClZ1u0013cy81c01E9enp; Wed, 30 Nov 2016 14:09:38 +
Received: from resimta-po-14v.sys. .net ([96.114.154.142])
by resqmta-po-08v.sys..net with SMTP
id C5ZqcRB3e2dNjC5ZqcQvHl; Wed, 30 Nov 2016 14:09:38 +
Received: from outgoingemail1.digitalrightscorp.com ([69.36.73.150])
by resimta-po-14v.sys..net with SMTP
id C5ZNcJfg9npCYC5Zcceh9K; Wed, 30 Nov 2016 14:09:25 +
X-Xfinity-Message-Heuristics: IPv6:N;TLS=0;SPF=0;DMARC=
Received: from outgoingemail1-69-150 (localhost [127.0.0.1])
by outgoingemail1. XRightsCorp.com (Postfix) with ESMTP id 
15EB7100419
for ; Wed, 30 Nov 2016 06:05:52 -0800 (PST)
From: a...@xrightscorp.com
To: d...@.net
Message-ID: <551271522.6.1480514752082.JavaMail.root@outgoingemail1-69-150>
Subject: Unauthorized Use of Copyrights RE:
TC-cc0ae97d-8918-4a4b-8515-749ff9303bc0
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Wed, 30 Nov 2016 06:05:52 -0800 (PST)
X-CMAE-Envelope: 
MS4wfAIoEnMl1VVV7nPS/7pis5Gr/ijSjTNaioaGiZVCAo4cXRoeTl9Z1Nt8SYSY4kX7RpDlZuxzGbzyeRDJIorfdeodi9fzNtQETs56Or8SwlysmgQQQt4R
kKDdiZaRx3Q0be579K6C4XZGyRC6JMDzDi1X6bXgBL8KYDFFA/aEyOBd+2Zrz1YpOi2aTjzyRc4d4MXJwaIGivtlXtZc6R5KypOhVP6eX1kx/qV9OwVzXAz6

**NOTE TO ISP: PLEASE FORWARD THE ENTIRE NOTICE***

Re: Unauthorized Use of Copyrights Owned Exclusively by The Bicycle Music 
Company

Reference#: ZBP96D4  IP Address: 73.166.122.44

Dear Sir or Madam:
.
.
.
.
.
.


Regards,
~Sri


Re: Field collapsing, facets, and qtime: caching issue?

2017-02-13 Thread Joel Bernstein
The additional work is done in the QueryComponent I believe. There is a
flag that tells the QueryComponent if the DocSet is needed. If that's set
to true and it's not available it will build the DocSet.

We ran into the facet refinement issue I mentioned at Alfresco and I
created this ticket: https://issues.apache.org/jira/browse/SOLR-8092.

Fixing this problem would likely resolve your scenario as well.

I haven't broken ground on it yet though.






Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, Feb 13, 2017 at 12:52 PM, ronbraun  wrote:

> Thanks for the explanation, Joel.  When you say the query/collapse needs to
> be re-run, is this the facet component that needs to do this?  The confusing
> part is that the debug suggests the time is being spent in the query
> component when faceting is enabled.  My naive reading of your response would
> give me the expectation that by enabling facets with facet=true, the facet
> component would need to do additional work and so the qTime cost would be
> paid by that component.  Here is the debug I get for repeated hits against
> /default?indent=on&q=*:*&wt=json&fq={!collapse+field=groupid}&facet=true&
> debugQuery=on:
>
> "process": {
> "time": 200.0,
> "query": { "time": 200.0 },
> "facet": { "time": 0.0 },
> "facet_module": { "time": 0.0 },
> "mlt": { "time": 0.0 },
> "highlight": { "time": 0.0 },
> "stats": { "time": 0.0 },
> "expand": { "time": 0.0 },
> "terms": { "time": 0.0 },
> "debug": { "time": 0.0 }
> }
>
> Or perhaps the facet component uses the query component to rerun the query
> and the time is billed to that component?
>
> Regardless, is the lack of caching a known and ticketed issue?  The
> consensus across various other solr tickets regarding grouped search seems
> to be to prefer the collapse/expand approach to grouping.  I'm using
> non-grouped search now but would like to switch to grouped and
> collapse/expand could work for my use case, but the effective defeat of
> query caching for any faceted application seems pretty problematic and I'd
> be hesitant to switch over if I'm effectively losing query caching by doing
> so.  My query cache hit rate is reasonably high.
>
> Thanks!
>
> Ron
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Field-collapsing-facets-and-qtime-caching-
> issue-tp4319759p4320114.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Unable to build Solr 5.5.3 from source

2017-02-13 Thread Steve Rowe
Sahil,

Dependency versions are in lucene/ivy-versions.properties.  When we upgrade, we 
change the version there instead of in each ivy.xml file with the dependency.
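
For example, the relevant line there looks like this (a sketch of the
format; the exact version depends on the branch):

/commons-fileupload/commons-fileupload = 1.3.2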

--
Steve
www.lucidworks.com

> On Feb 13, 2017, at 11:00 AM, Sahil Agarwal  wrote:
> 
> The issue has been fixed. Seems there is a problem in *solr/core/ivy.xml *
> 
> <dependency org="commons-fileupload" name="commons-fileupload" rev="${/commons-fileupload/commons-fileupload}" conf="compile"/>
> 
> In this line, I replaced the ${/commons-fileupload/commons-fileupload} with
> 1.3.2 as the variable seemed to be downloading version 1.3.1 of the
> commons-fileupload instead of the latest 1.3.2 version.
> 
> Once this was done, ant built the sources successfully.
> 
> Thanks!
> Sahil
> 
> On 13 February 2017 at 19:30, Shawn Heisey  wrote:
> 
>> On 2/12/2017 11:52 PM, Sahil Agarwal wrote:
>>> I have not been able to build Solr 5.5.3 from the source.
>> 
>>> Detected Java version: 1.8 in: /usr/lib/jvm/jdk1.8.0_121/jre
>> 
>> The unresolved dependency error is unusual, I'm not really sure what's
>> going on there.  My best idea would be to delete the ivy cache entirely
>> and try again.  These would be the commands I would use, from the top
>> level of the source code:
>> 
>> rm -rf ~/.ivy2
>> ant clean clean-jars
>> 
>> This will cause ivy to re-download all dependent jars when you do the
>> compile, and if you are using ivy with any other java source code, might
>> cause some temporary issues for those builds.
>> 
>> Even if you get ivy to work right, you're going to run into another
>> problem due to the JDK version you've got.  Oracle changed the javadoc
>> compiler to be more strict in that version, which broke the build.
>> 
>> https://issues.apache.org/jira/browse/LUCENE-7651
>> 
>> The fix has been backported to the 5.5 branch, so it will be available
>> in the 5.5.4 tag when it is created.  The 5.5.3 build will continue to
>> be broken with Java 8u121.
>> 
>> You'll need to either get the branch_5_5 source code from git to build
>> 5.5.4, or downgrade your JDK version.  Alternatively, you can wait for
>> the 5.5.4 release to be available to get the source code, or get the
>> patch and apply it to your 5.5.3 code.  I do not know if the patch will
>> apply cleanly -- it may require manual work.
>> 
>> Thanks,
>> Shawn
>> 
>> 



json facet API response size

2017-02-13 Thread ahmed darweesh
I tried migrating our facet search from the old facet method to the new
json facet API, but there is one problem in the size of the returned
response. For example, one query's response is around 1.2 MB, while the
same query using the old facet method produces a response of around 160 KB.

Is there any way to reduce the size of the JSON Facet response?

I'm using version 5.2 BTW.


Re: Field collapsing, facets, and qtime: caching issue?

2017-02-13 Thread ronbraun
Thanks for the explanation, Joel.  When you say the query/collapse needs to
be re-run, is this the facet component that needs to do this?  The confusing
part is that the debug suggests the time is being spent in the query
component when faceting is enabled.  My naive reading of your response would
give me the expectation that by enabling facets with facet=true, the facet
component would need to do additional work and so the qTime cost would be
paid by that component.  Here is the debug I get for repeated hits against
/default?indent=on&q=*:*&wt=json&fq={!collapse+field=groupid}&facet=true&debugQuery=on:

"process": {
"time": 200.0,
"query": { "time": 200.0 },
"facet": { "time": 0.0 },
"facet_module": { "time": 0.0 },
"mlt": { "time": 0.0 },
"highlight": { "time": 0.0 },
"stats": { "time": 0.0 },
"expand": { "time": 0.0 },
"terms": { "time": 0.0 },
"debug": { "time": 0.0 }
}

Or perhaps the facet component uses the query component to rerun the query
and the time is billed to that component?

Regardless, is the lack of caching a known and ticketed issue?  The
consensus across various other solr tickets regarding grouped search seems
to be to prefer the collapse/expand approach to grouping.  I'm using
non-grouped search now but would like to switch to grouped and
collapse/expand could work for my use case, but the effective defeat of
query caching for any faceted application seems pretty problematic and I'd
be hesitant to switch over if I'm effectively losing query caching by doing
so.  My query cache hit rate is reasonably high.

Thanks!

Ron




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Field-collapsing-facets-and-qtime-caching-issue-tp4319759p4320114.html
Sent from the Solr - User mailing list archive at Nabble.com.


Get orphan documents

2017-02-13 Thread Ivan Bianchi
Hi,

I want to check if I have orphan documents in my core.

I found this utility *CheckJoinIndex* in Lucene, but I really don't know
how to execute it in my Solr 5.5.3 core.
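
For what it's worth, CheckJoinIndex can be run from a small standalone Java
program against the core's index directory (a sketch, assuming the Lucene
5.5 jars on the classpath; the index path and the parent filter query are
illustrative):

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.join.CheckJoinIndex;
import org.apache.lucene.search.join.QueryBitSetProducer;
import org.apache.lucene.store.FSDirectory;

try (DirectoryReader reader = DirectoryReader.open(
        FSDirectory.open(Paths.get("/var/solr/data/mycore/data/index")))) {
    // Identify parents the same way a {!parent which=...} query would
    QueryBitSetProducer parents =
        new QueryBitSetProducer(new TermQuery(new Term("content_type", "A")));
    CheckJoinIndex.check(reader, parents); // throws if a block is malformed
}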

Is there a query that can give me the orphan documents in Solr?

I have a schema like this:

{
  "id": 1,
  "content_type": "A",
  "_childDocuments_": [
    { "id": "S_1", "content_type": "B" }
  ]
}


Best regards,

-- 
Ivan


Re: Upgrade SOLR version - facets perfomance regression

2017-02-13 Thread SOLR4189
I finished writing the FacetConverter, but I have a question:
How do I configure the facet.threads parameter in the JSON Facet API?

I didn't find the right syntax on the Confluence page.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-SOLR-version-facets-perfomance-regression-tp4315027p4320104.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Upgrade SOLR version - facets perfomance regression

2017-02-13 Thread SOLR4189
I finished writing the FacetConverter, but I have some questions:
  1) How do I configure the facet.threads parameter in the JSON Facet API?
  2) How do I add facet.pivot to a query? For example, I need
  *q=*:*&facet=true&facet.pivot=A,B*
and I tried to write something like this:
 *q=*:*&json.facet={ A_B : {type:terms, field:A,
facet:{type:terms, field:B} } }*
but I get the error: wrong aggr_B field.

I didn't find the right syntax on the Confluence page.
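
For reference, a nested form like this seems closer to what the JSON Facet
API expects (a sketch, with A and B as placeholder field names); note that
the inner facet gets its own name, which the attempt above was missing:

q=*:*&json.facet={
  A_B : {
    type  : terms,
    field : A,
    facet : { B : { type : terms, field : B } }
  }
}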



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-SOLR-version-facets-perfomance-regression-tp4315027p4320103.html
Sent from the Solr - User mailing list archive at Nabble.com.


Continual garbage collection loop

2017-02-13 Thread Leon STRINGER
Hi,

I get an issue where, when I'm deleting and adding Solr cores, it appears to go
into a loop increasing CPU load and continually (every 2 seconds) logging to the
garbage collection log.

I had this problem with 6.1.0 so we've just upgraded to 6.4.1 and the issue
still occurs. The entries being logged every 2 seconds are below (hope it's not
too verbose). Obviously this means the log gets big quickly.

We can work around the issue by restarting Solr but presumably something has
gone wrong. Can anyone suggest if we're doing something incorrectly to cause
this, or if it's an issue we can troubleshoot.

Any advice gratefully received.

On CentOS 7 with OpenJDK 1.8.0_91-b14.

solr_gc.log.0.current logs the following every 2 seconds:

2017-02-13T16:19:11.230+: 5092.640: [GC (CMS Initial Mark) [1
CMS-initial-mark: 225270K(393216K)] 225280K(502464K), 0.0030517 secs] [Times:
user=0.01 sys=0.00, real=0.01 secs]
2017-02-13T16:19:11.234+: 5092.643: Total time for which application threads
were stopped: 0.0033800 seconds, Stopping threads took: 0.473 seconds
2017-02-13T16:19:11.234+: 5092.643: [CMS-concurrent-mark-start]
2017-02-13T16:19:11.359+: 5092.769: [CMS-concurrent-mark: 0.125/0.125 secs]
[Times: user=0.50 sys=0.00, real=0.12 secs]
2017-02-13T16:19:11.359+: 5092.769: [CMS-concurrent-preclean-start]
2017-02-13T16:19:11.361+: 5092.771: [CMS-concurrent-preclean: 0.002/0.002
secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2017-02-13T16:19:11.362+: 5092.771: [GC (CMS Final Remark) [YG occupancy: 10
K (109248 K)]{Heap before GC invocations=3236 (full 1150):
par new generation total 109248K, used 10K [0xe000,
0xe800, 0xe800)
eden space 87424K, 0% used [0xe000, 0xe0001020,
0xe556)
from space 21824K, 0% used [0xe6ab, 0xe6ab1830,
0xe800)
to space 21824K, 0% used [0xe556, 0xe556,
0xe6ab)
concurrent mark-sweep generation total 393216K, used 225270K
[0xe800, 0x0001, 0x0001)
Metaspace used 176850K, capacity 179580K, committed 181092K, reserved 1210368K
class space used 18794K, capacity 19506K, committed 19836K, reserved 1048576K
2017-02-13T16:19:11.362+: 5092.771: [GC (CMS Final Remark)
2017-02-13T16:19:11.362+: 5092.771: [ParNew
Desired survivor size 20112992 bytes, new threshold 8 (max 8)
- age 2: 160 bytes, 160 total
- age 4: 32 bytes, 192 total
: 10K->6K(109248K), 0.0041872 secs] 225280K->225276K(502464K), 0.0042455 secs]
[Times: user=0.01 sys=0.00, real=0.01 secs]
Heap after GC invocations=3237 (full 1150):
par new generation total 109248K, used 6K [0xe000,
0xe800, 0xe800)
eden space 87424K, 0% used [0xe000, 0xe000,
0xe556)
from space 21824K, 0% used [0xe556, 0xe5561830,
0xe6ab)
to space 21824K, 0% used [0xe6ab, 0xe6ab,
0xe800)
concurrent mark-sweep generation total 393216K, used 225270K
[0xe800, 0x0001, 0x0001)
Metaspace used 176850K, capacity 179580K, committed 181092K, reserved 1210368K
class space used 18794K, capacity 19506K, committed 19836K, reserved 1048576K
}
2017-02-13T16:19:11.366+: 5092.775: [Rescan (parallel) , 0.0018980
secs]2017-02-13T16:19:11.368+: 5092.777: [weak refs processing, 0.0004940
secs]2017-02-13T16:19:11.368+: 5092.778: [class unloading, 0.0580950
secs]2017-02-13T16:19:11.426+: 5092.836: [scrub symbol table, 0.0110875
secs]2017-02-13T16:19:11.438+: 5092.847: [scrub string table, 0.0019072
secs][1 CMS-remark: 225270K(393216K)] 225276K(502464K), 0.0780250 secs] [Times:
user=0.09 sys=0.00, real=0.08 secs]
2017-02-13T16:19:11.440+: 5092.849: Total time for which application threads
were stopped: 0.0782677 seconds, Stopping threads took: 0.411 seconds
2017-02-13T16:19:11.440+: 5092.849: [CMS-concurrent-sweep-start]
2017-02-13T16:19:11.546+: 5092.955: [CMS-concurrent-sweep: 0.106/0.106 secs]
[Times: user=0.11 sys=0.00, real=0.11 secs]
2017-02-13T16:19:11.546+: 5092.955: [CMS-concurrent-reset-start]
2017-02-13T16:19:11.546+: 5092.956: [CMS-concurrent-reset: 0.001/0.001 secs]
[Times: user=0.00 sys=0.00, real=0.00 secs]

Regards,

Leon Stringer

Re: ConcurrentUpdateSolrServer update response doesn't reflect correct response status

2017-02-13 Thread Lasitha Wattaladeniya
Hi Shawn,

Thanks for the detailed explanation, really informative. However, after
further analysis I found that I can use HttpSolrServer instead of
ConcurrentUpdateSolrServer and handle the concurrency myself. The issue
with ConcurrentUpdateSolrServer is still there, though; it seems like an
interesting topic to look into.
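
A minimal sketch of that approach (assuming SolrJ 5.x, where HttpSolrClient
is the current name for HttpSolrServer; the URL, pool size, and batches()
source are placeholders):

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

ExecutorService pool = Executors.newFixedThreadPool(4);
HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/core1");
for (final List<SolrInputDocument> batch : batches()) { // batches() is a stand-in
    pool.submit(() -> {
        try {
            client.add(batch);      // unlike ConcurrentUpdateSolrServer,
        } catch (Exception e) {     // this surfaces a 400 as an exception
            System.err.println("Batch failed: " + e.getMessage()); // log or retry
        }
    });
}
pool.shutdown();

This keeps the concurrency but makes each batch's success or failure
visible to the caller.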

Regards,
Lasitha

Lasitha Wattaladeniya
Software Engineer

Mobile : +6593896893
Blog : techreadme.blogspot.com

On Mon, Feb 13, 2017 at 10:01 PM, Shawn Heisey  wrote:

> On 2/12/2017 10:51 PM, Lasitha Wattaladeniya wrote:
> > Thanks for the reply.  But even if I add a single document,  the response
> > doesn't reflect the correct response status.  From the given solr console
> > ui,  it shows status 400 for the bad requests (when I try to index
> document
> > with no fields ). But when I do the same using solrj,
> response
> > is 0 eventhough the console prints it as a bad request with status 400.
>
> The concurrent client swallows all indexing exceptions.  This is how it
> is designed.  In fact, it's potentially even worse than you might expect
> -- the client not only always returns a success for updates sent, it
> returns immediately, often before any updates are sent to the server.
> The updates are sent in the background.  If another request type is
> sent, like a commit, then the client will wait for all queued updates to
> be sent before it sends that request and returns.  If an error is
> encountered during an explicit commit, I believe that the client *is*
> notified about that, but I have not tested this.
>
> If you want multi-threaded indexing *and* error detection, you're going
> to need to handle the multiple threads in your own program, using the
> Http or Cloud client object instead.  The Concurrent object is good for
> initial bulk indexing, but not for the kind of indexing where you want
> to be informed about errors.
>
> I'm told that if you override the "handleError" method, you might be
> able to make it work how you want, but I was unable to figure out how to
> modify that method to achieve that result.
>
> I filed an issue and even came up with an idea for a fix, but nobody was
> excited about it.  Because of the design of the client, it wasn't
> possible to guarantee that no further indexing had occurred after the
> first error encountered.
>
> https://issues.apache.org/jira/browse/SOLR-3284
>
> Thanks,
> Shawn
>
>


Re: Unable to build Solr 5.5.3 from source

2017-02-13 Thread Sahil Agarwal
Thanks Shawn, I have already come across the javadoc compiler problem after
solving the dependency problem. I do have a workaround in mind for it, will
try to apply it tomorrow and will update if it works.

On 13 February 2017 at 21:30, Sahil Agarwal 
wrote:

> The issue has been fixed. Seems there is a problem in *solr/core/ivy.xml *
>
> <dependency org="commons-fileupload" name="commons-fileupload" rev="${/commons-fileupload/commons-fileupload}" conf="compile"/>
>
> In this line, I replaced the ${/commons-fileupload/commons-fileupload}
> with 1.3.2 as the variable seemed to be downloading version 1.3.1 of the
> commons-fileupload instead of the latest 1.3.2 version.
>
> Once this was done, ant built the sources successfully.
>
> Thanks!
> Sahil
>
> On 13 February 2017 at 19:30, Shawn Heisey  wrote:
>
>> On 2/12/2017 11:52 PM, Sahil Agarwal wrote:
>> > I have not been able to build Solr 5.5.3 from the source.
>> 
>> > Detected Java version: 1.8 in: /usr/lib/jvm/jdk1.8.0_121/jre
>>
>> The unresolved dependency error is unusual, I'm not really sure what's
>> going on there.  My best idea would be to delete the ivy cache entirely
>> and try again.  These would be the commands I would use, from the top
>> level of the source code:
>>
>> rm -rf ~/.ivy2
>> ant clean clean-jars
>>
>> This will cause ivy to re-download all dependent jars when you do the
>> compile, and if you are using ivy with any other java source code, might
>> cause some temporary issues for those builds.
>>
>> Even if you get ivy to work right, you're going to run into another
>> problem due to the JDK version you've got.  Oracle changed the javadoc
>> compiler to be more strict in that version, which broke the build.
>>
>> https://issues.apache.org/jira/browse/LUCENE-7651
>>
>> The fix has been backported to the 5.5 branch, so it will be available
>> in the 5.5.4 tag when it is created.  The 5.5.3 build will continue to
>> be broken with Java 8u121.
>>
>> You'll need to either get the branch_5_5 source code from git to build
>> 5.5.4, or downgrade your JDK version.  Alternatively, you can wait for
>> the 5.5.4 release to be available to get the source code, or get the
>> patch and apply it to your 5.5.3 code.  I do not know if the patch will
>> apply cleanly -- it may require manual work.
>>
>> Thanks,
>> Shawn
>>
>>
>


Re: Unable to build Solr 5.5.3 from source

2017-02-13 Thread Sahil Agarwal
The issue has been fixed. Seems there is a problem in *solr/core/ivy.xml *

<dependency org="commons-fileupload" name="commons-fileupload" rev="${/commons-fileupload/commons-fileupload}" conf="compile"/>

In this line, I replaced the ${/commons-fileupload/commons-fileupload} with
1.3.2 as the variable seemed to be downloading version 1.3.1 of the
commons-fileupload instead of the latest 1.3.2 version.

Once this was done, ant built the sources successfully.

Thanks!
Sahil

On 13 February 2017 at 19:30, Shawn Heisey  wrote:

> On 2/12/2017 11:52 PM, Sahil Agarwal wrote:
> > I have not been able to build Solr 5.5.3 from the source.
> 
> > Detected Java version: 1.8 in: /usr/lib/jvm/jdk1.8.0_121/jre
>
> The unresolved dependency error is unusual, I'm not really sure what's
> going on there.  My best idea would be to delete the ivy cache entirely
> and try again.  These would be the commands I would use, from the top
> level of the source code:
>
> rm -rf ~/.ivy2
> ant clean clean-jars
>
> This will cause ivy to re-download all dependent jars when you do the
> compile, and if you are using ivy with any other java source code, might
> cause some temporary issues for those builds.
>
> Even if you get ivy to work right, you're going to run into another
> problem due to the JDK version you've got.  Oracle changed the javadoc
> compiler to be more strict in that version, which broke the build.
>
> https://issues.apache.org/jira/browse/LUCENE-7651
>
> The fix has been backported to the 5.5 branch, so it will be available
> in the 5.5.4 tag when it is created.  The 5.5.3 build will continue to
> be broken with Java 8u121.
>
> You'll need to either get the branch_5_5 source code from git to build
> 5.5.4, or downgrade your JDK version.  Alternatively, you can wait for
> the 5.5.4 release to be available to get the source code, or get the
> patch and apply it to your 5.5.3 code.  I do not know if the patch will
> apply cleanly -- it may require manual work.
>
> Thanks,
> Shawn
>
>


Re: Heads up: SOLR-10130, Performance issue in Solr 6.4.1

2017-02-13 Thread Walter Underwood
I’m seeing similar problems here. With 6.4.0, we were handling 6000 
requests/minute. With 6.4.1 it is 1000 rpm with median response times around 
2.5 seconds. I also switched to the G1 collector. I’m going to back that out 
and retest today to see if the performance comes back.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Feb 13, 2017, at 4:46 AM, Ere Maijala  wrote:
> 
> Hi all,
> 
> this is just a quick heads-up that we've stumbled on serious performance 
> issues after upgrading to Solr 6.4.1 apparently due to the new metrics 
> collection causing a major slowdown. I've filed an issue 
> (https://issues.apache.org/jira/browse/SOLR-10130) about it, but decided to 
> post this just so that no one else has to encounter this unprepared. 
> It seems to me that metrics would need to be explicitly disabled altogether 
> in the index config to avoid the issue.
> 
> --Ere



Re: Java version set to 1.8 for SOLR 6.4.0

2017-02-13 Thread Uchit Patel
Hi Shawn,
I followed these steps:
Stopped the existing SOLR 5.1.0 with bin/solr stop -p 8983
I copied the solr-6.4.0.tgz file to the root directory, /opt/wml.


Then I extracted (installed) it with
tar zxf solr-6.4.0.tgz

Let me know what's wrong.


Thanks.
Regards,
Uchit Patel



  From: Shawn Heisey 
 To: solr-user@lucene.apache.org 
 Sent: Monday, February 13, 2017 7:30 PM
 Subject: Re: Java version set to 1.8 for SOLR 6.4.0
   
On 2/13/2017 3:13 AM, Uchit Patel wrote:
> I have updated SOLR_JAVA_HOME in following file.
> /opt/wml/solr-6.4.0/bin/solr.in.sh SOLR_JAVA_HOME =
> "/opt/wml/jdk1.8.0_66/jre/bin/java"  But it is not working. 

If you *installed* Solr using the service installer script, then that is
not the correct file to update.  The service installer has been putting
the include script in /etc/default for some time.  Back in earlier 5.5
versions, the file would end up in the solr home, which has a default
location of /var/solr.

Thanks,
Shawn



   

Re: how to get modified field data if it doesn't exist in meta

2017-02-13 Thread Gytis Mikuciunas
Hi,

Can anyone compile this into a jar file for me? I found something similar to
what I need on Google:
http://stackoverflow.com/questions/20745935/set-last-modified-field-when-not-defined-in-document-in-solr

package modifiedG4;

import java.io.IOException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class LastModifiedMergeProcessorFactory
    extends UpdateRequestProcessorFactory {

  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req,
      SolrQueryResponse rsp, UpdateRequestProcessor next) {
    return new LastModifiedMergeProcessor(next);
  }
}

class LastModifiedMergeProcessor extends UpdateRequestProcessor {

  public LastModifiedMergeProcessor(UpdateRequestProcessor next) {
    super(next);
  }

  @Override
  public void processAdd(AddUpdateCommand cmd) throws IOException {
    SolrInputDocument doc = cmd.getSolrInputDocument();

    // Fall back to the file date when the extracted metadata
    // carries no last_modified value.
    Object metaDate = doc.getFieldValue("last_modified");
    Object fileDate = doc.getFieldValue("file_date");
    if (metaDate == null && fileDate != null) {
      doc.addField("last_modified", fileDate);
    }

    // pass it up the chain
    super.processAdd(cmd);
  }
}

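A minimal sketch of the filesystem-based variant Alexandre suggests below,
assuming the source path arrives in a hypothetical "resourcename" field
(adjust to whatever field actually carries it in your setup), could look
like this:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Date;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;

class FileDateLookupProcessor extends UpdateRequestProcessor {

  public FileDateLookupProcessor(UpdateRequestProcessor next) {
    super(next);
  }

  @Override
  public void processAdd(AddUpdateCommand cmd) throws IOException {
    SolrInputDocument doc = cmd.getSolrInputDocument();
    // Hypothetical field assumed to hold the source file path.
    Object path = doc.getFieldValue("resourcename");
    if (doc.getFieldValue("last_modified") == null && path != null) {
      Path p = Paths.get(path.toString());
      if (Files.exists(p)) {
        // Read the modification time from the filesystem; this only works
        // when the indexed files are visible from the Solr host.
        doc.addField("last_modified",
            new Date(Files.getLastModifiedTime(p).toMillis()));
      }
    }
    super.processAdd(cmd);
  }
}
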
On Sun, Feb 12, 2017 at 8:45 PM, Alexandre Rafalovitch 
wrote:

> It would have to be a custom one. One you write. But I believe Tika
> would pass a file name as one of the parameters, so you just need to
> use standard Java API to look up the system date. That - of course -
> assumes that the files you index are on the same filesystem as Solr
> itself, so it could look it up.
>
> You can find more about the URPs at:
> https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors
> You can find the full list of the URPs at:
> http://www.solr-start.com/info/update-request-processors/
> If you are on the latest Solr 6.4, you would probably want to subclass
> SimpleUpdateProcessorFactory and follow the implementation example of
> TemplateUpdateProcessorFactory
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/6.4.0/solr/core/src/java/org/apache/solr/update/processor/TemplateUpdateProcessorFactory.java
>
> Alternatively, you could implement your URP in Javascript, but I am
> not sure that has an API to check file dates.
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 12 February 2017 at 13:28, Gytis Mikuciunas  wrote:
> > Alexandre, could you provide some link or give more info about this
> > processor?
> > I'm a novice in the solr world ;)
> >
> >
> > Regards,
> > Gytis
> >
> > On Feb 10, 2017 14:59, "Alexandre Rafalovitch" 
> wrote:
> >
> > A custom update request processor that looks up the file from its name
> > and gets the date should work.
> >
> > Regards,
> > Alex
> >
> > On 10 Feb 2017 2:39 AM, "Gytis Mikuciunas"  wrote:
> >
> > Hi,
> >
> > We have started to use solr for our documents indexing (vsd, vsdx,
> > xls,xlsx, doc, docx, pdf, txt).
> >
> > A modified date value is needed for each file. MS Office files and PDFs
> > have this value.
> > The problem is with txt files, as they don't have this value in their meta.
> >
> > Is there any way to get it from the OS level and force-add it to Solr
> > when we do the indexing?
> >
> > p.s.
> >
> > Windows 2012 server, single instance
> >
> > typical command we use: java -Dauto -Dc=index_sandbox -Dport=80
> > -Dfiletypes=vsd,vsdx,xls,xlsx,doc,docx,pdf,txt -Dbasicauth=admin:
> -jar
> > example/exampledocs/post.jar "M:\DNS_dump"
> >
> >
> > Regards,
> >
> > Gytis
>


Re: ConcurrentUpdateSolrServer update response doesn't reflect correct response status

2017-02-13 Thread Shawn Heisey
On 2/12/2017 10:51 PM, Lasitha Wattaladeniya wrote:
> Thanks for the reply.  But even if I add a single document,  the response
> doesn't reflect the correct response status.  From the given solr console
> ui,  it shows status 400 for the bad requests (when I try to index a document
> with no fields). But when I do the same using solrj, the response
> is 0 even though the console prints it as a bad request with status 400.

The concurrent client swallows all indexing exceptions.  This is how it
is designed.  In fact, it's potentially even worse than you might expect
-- the client not only always returns a success for updates sent, it
returns immediately, often before any updates are sent to the server. 
The updates are sent in the background.  If another request type is
sent, like a commit, then the client will wait for all queued updates to
be sent before it sends that request and returns.  If an error is
encountered during an explicit commit, I believe that the client *is*
notified about that, but I have not tested this.

If you want multi-threaded indexing *and* error detection, you're going
to need to handle the multiple threads in your own program, using the
Http or Cloud client object instead.  The Concurrent object is good for
initial bulk indexing, but not for the kind of indexing where you want
to be informed about errors.
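
As a rough illustration, a multi-threaded indexer along those lines,
assuming the SolrJ 6.x API and a hypothetical core URL, might look like
this, with each batch reporting its own failure:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ErrorAwareIndexer {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/test").build();
    ExecutorService pool = Executors.newFixedThreadPool(4);

    for (int b = 0; b < 10; b++) {
      List<SolrInputDocument> batch = new ArrayList<>();
      for (int i = 0; i < 1000; i++) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", b + "-" + i);
        batch.add(doc);
      }
      pool.submit(() -> {
        try {
          // Unlike the concurrent client, this throws on an HTTP 400.
          client.add(batch);
        } catch (Exception e) {
          System.err.println("Batch failed: " + e);
        }
      });
    }

    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
    client.commit();
    client.close();
  }
}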

I'm told that if you override the "handleError" method, you might be
able to make it work how you want, but I was unable to figure out how to
modify that method to achieve that result.
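
For anyone who wants to experiment with that anyway, a sketch (assuming
SolrJ 6.x, where the url/queueSize/threadCount constructor is still
available) that at least records the first failure might be:

import java.util.concurrent.atomic.AtomicReference;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;

public class FirstErrorConcurrentClient extends ConcurrentUpdateSolrClient {

  private final AtomicReference<Throwable> firstError = new AtomicReference<>();

  public FirstErrorConcurrentClient(String solrUrl, int queueSize, int threadCount) {
    super(solrUrl, queueSize, threadCount);
  }

  @Override
  public void handleError(Throwable ex) {
    // Remember only the first failure; the default implementation just logs.
    firstError.compareAndSet(null, ex);
    super.handleError(ex);
  }

  // Null if no background update has failed so far. Per the caveat above,
  // documents sent after the first error may still have been indexed.
  public Throwable getFirstError() {
    return firstError.get();
  }
}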

I filed an issue and even came up with an idea for a fix, but nobody was
excited about it.  Because of the design of the client, it wasn't
possible to guarantee that no further indexing had occurred after the
first error encountered.

https://issues.apache.org/jira/browse/SOLR-3284

Thanks,
Shawn



Re: Java version set to 1.8 for SOLR 6.4.0

2017-02-13 Thread Shawn Heisey
On 2/13/2017 3:13 AM, Uchit Patel wrote:
> I have updated SOLR_JAVA_HOME in following file.
> /opt/wml/solr-6.4.0/bin/solr.in.sh SOLR_JAVA_HOME =
> "/opt/wml/jdk1.8.0_66/jre/bin/java"  But it is not working. 

If you *installed* Solr using the service installer script, then that is
not the correct file to update.  The service installer has been putting
the include script in /etc/default for some time.  Back in earlier 5.5
versions, the file would end up in the solr home, which has a default
location of /var/solr.

Thanks,
Shawn



Re: Unable to build Solr 5.5.3 from source

2017-02-13 Thread Shawn Heisey
On 2/12/2017 11:52 PM, Sahil Agarwal wrote:
> I have not been able to build Solr 5.5.3 from the source.

> Detected Java version: 1.8 in: /usr/lib/jvm/jdk1.8.0_121/jre

The unresolved dependency error is unusual, I'm not really sure what's
going on there.  My best idea would be to delete the ivy cache entirely
and try again.  These would be the commands I would use, from the top
level of the source code:

rm -rf ~/.ivy2
ant clean clean-jars

This will cause ivy to re-download all dependent jars when you do the
compile, and if you are using ivy with any other java source code, might
cause some temporary issues for those builds.

Even if you get ivy to work right, you're going to run into another
problem due to the JDK version you've got.  Oracle changed the javadoc
compiler to be more strict in that version, which broke the build.

https://issues.apache.org/jira/browse/LUCENE-7651

The fix has been backported to the 5.5 branch, so it will be available
in the 5.5.4 tag when it is created.  The 5.5.3 build will continue to
be broken with Java 8u121.

You'll need to either get the branch_5_5 source code from git to build
5.5.4, or downgrade your JDK version.  Alternatively, you can wait for
the 5.5.4 release to be available to get the source code, or get the
patch and apply it to your 5.5.3 code.  I do not know if the patch will
apply cleanly -- it may require manual work.

Thanks,
Shawn



Re: bin/post and self-signed SSL

2017-02-13 Thread Jan Høydahl
Thanks for your answers. I was also able to work around it using cURL, but
we should obviously fix bin/post to be as smart as bin/solr in parsing
environment variables related to SSL and auth.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 7. feb. 2017 kl. 01.23 skrev Kevin Risden :
> 
> I expect that the commands work the same or very close from 5.5.x through
> 6.4.x. There have been some cleaning up of the bin/solr and bin/post
> commands but not many security changes. If you find differently then please
> let us know.
> 
> Kevin Risden
> 
> On Feb 5, 2017 21:02, "alias" <524839...@qq.com> wrote:
> 
>> You mean this can only be used in version 5.5.x? Is it invalid for other
>> versions?
>> 
>> 
>> 
>> 
>> -- Original Message --
>> From: "Kevin Risden";;
>> Sent: Monday, February 6, 2017, 9:44 AM
>> To: "solr-user";
>> 
>> Subject: Re: bin/post and self-signed SSL
>> 
>> 
>> 
>> Originally formatted as MarkDown. This was tested against Solr 5.5.x
>> packaged as Lucidworks HDP Search, so it should behave the same as stock
>> Solr 5.5.x.
>> 
>> # Using Solr
>> * https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
>> * https://cwiki.apache.org/confluence/display/solr/Running+Solr
>> * https://cwiki.apache.org/confluence/display/solr/Collections+API
>> 
>> ## Create collection (w/o Kerberos)
>> ```bash
>> /opt/lucidworks-hdpsearch/solr/bin/solr create -c test
>> ```
>> 
>> ## Upload configuration directory (w/ SSL and Kerberos)
>> ```bash
>> /opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh
>> -zkhost ZK_CONNECTION_STRING -cmd upconfig -confname basic_config -confdir
>> /opt/lucidworks-hdpsearch/solr/server/solr/configsets/basic_configs/conf
>> ```
>> 
>> ## Create Collection (w/ SSL and Kerberos)
>> ```bash
>> curl -k --negotiate -u : "
>> https://SOLR_HOST:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=1&replicationFactor=1&collection.configName=basic_config
>> "
>> ```
>> 
>> ## Delete collection (w/o Kerberos)
>> ```bash
>> /opt/lucidworks-hdpsearch/solr/bin/solr delete -c test
>> ```
>> 
>> ## Delete Collection (w/ SSL and Kerberos)
>> ```bash
>> curl -k --negotiate -u : "
>> https://SOLR_HOST:8983/solr/admin/collections?action=DELETE&name=newCollection
>> "
>> ```
>> 
>> ## Adding some test docs (w/o SSL)
>> ```bash
>> /opt/lucidworks-hdpsearch/solr/bin/post -c test
>> /opt/lucidworks-hdpsearch/solr/example/exampledocs/*.xml
>> ```
>> 
>> ## Adding documents (w/ SSL and Kerberos)
>> ```bash
>> curl -k --negotiate -u : "
>> https://SOLR_HOST:8983/solr/newCollection/update?commit=true" -H
>> "Content-Type: application/json" --data-binary
>> @/opt/lucidworks-hdpsearch/solr/example/exampledocs/books.json
>> ```
>> 
>> ## List Collections (w/ SSL and Kerberos)
>> ```bash
>> curl -k --negotiate -u : "
>> https://SOLR_HOST:8983/solr/admin/collections?action=LIST"
>> ```
>> 
>> Kevin Risden
>> 
>> On Sun, Feb 5, 2017 at 5:55 PM, Kevin Risden 
>> wrote:
>> 
>>> Last time I looked at this, there was no way to pass any Java properties
>>> to the bin/post command. This made it impossible to even set the SSL
>>> properties manually. I checked master just now and still there is no
>>> place to enter Java properties that would make it to the Java command.
>>> 
>>> I came up with a chart of commands previously that worked with standard
>>> (no SSL or Kerberos), SSL only, and SSL with Kerberos. Only the standard
>>> solr setup worked for the bin/solr and bin/post commands. Errors popped
>>> up that I couldn't work around.
>>> haven't had a chance.
>>> 
>>> I'll try to share that info when I get back to my laptop.
>>> 
>>> Kevin Risden
>>> 
>>> On Feb 5, 2017 12:31, "Jan Høydahl"  wrote:
>>> 
 Hi,
 
 I’m trying to post a document to Solr using bin/post after enabling SSL
 with self signed certificate. Result is:
 
 $ post -url https://localhost:8983/solr/sslColl *.html
 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -classpath
 /opt/solr/dist/solr-core-6.4.0.jar -Dauto=yes -Durl=
 https://localhost:8983/solr/sslColl -Dc= -Ddata=files
 org.apache.solr.util.SimplePostTool lab-index.html lab-ops1.html
 lab-ops2.html lab-ops3.html lab-ops4.html lab-ops6.html lab-ops8.html
 SimplePostTool version 5.0.0
 Posting files to [base] url https://localhost:8983/solr/sslColl...
 Entering auto mode. File endings considered are
 xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,
 ods,ott,otp,ots,rtf,htm,html,txt,log
 POSTing file lab-index.html (text/html) to [base]/extract
 SimplePostTool: FATAL: Connection error (is Solr running at
 https://localhost:8983/solr/sslColl ?): javax.net.ssl.SSLHandshakeException:
 sun.security.validator.ValidatorException: PKIX path building failed:
 sun.security.provider.certpath.SunCertPathBuilderException: 

Heads up: SOLR-10130, Performance issue in Solr 6.4.1

2017-02-13 Thread Ere Maijala

Hi all,

this is just a quick heads-up that we've stumbled on serious performance 
issues after upgrading to Solr 6.4.1 apparently due to the new metrics 
collection causing a major slowdown. I've filed an issue 
(https://issues.apache.org/jira/browse/SOLR-10130) about it, but decided 
to post this just so that no one else has to encounter this 
unprepared. It seems to me that metrics would need to be explicitly 
disabled altogether in the index config to avoid the issue.


--Ere


Re: Unable to build Solr 5.5.3 from source

2017-02-13 Thread Steve Rowe
Hi Sahil,

I downloaded the Solr 5.5.3 source, deleted my Ivy cache, and successfully ran 
“ant compile” from the solr/ directory.

My Ant version is the same as yours.

Do you have ivy-2.3.0.jar in your ~/.ant/lib/ directory?  (I do.)

Are you attempting to compile the unmodified released source, or have you made 
modifications?

AFAICT, those are warnings, not errors - can you post the full output somewhere 
and give the link?

--
Steve
www.lucidworks.com

> On Feb 13, 2017, at 1:52 AM, Sahil Agarwal  wrote:
> 
> I have not been able to build Solr 5.5.3 from the source.
> 
> - I was able to build Solr 6.4 successfully but haven't been able to build 
> solr
> 5.5.3 (which I need) successfully.
> - I have tried deleting the cache and building again. Same errors.
> 
> I've been getting unresolved dependencies error. I get the following output
> when using ant compile -v
> 
> 
> Apache Ant(TM) version 1.9.6 compiled on July 8 2015
> Trying the default build file: build.xml
> Buildfile: /home/sahil/Work/solr/solr-5.5.3/build.xml
> Detected Java version: 1.8 in: /usr/lib/jvm/jdk1.8.0_121/jre
> Detected OS: Linux
> parsing buildfile /home/sahil/Work/solr/solr-5.5.3/build.xml with URI =
> file:/home/sahil/Work/solr/solr-5.5.3/build.xml
> Project base dir set to: /home/sahil/Work/solr/solr-5.5.3
> parsing buildfile
> jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/antlib.xml
> with URI = 
> jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/antlib.xml
> from a zip file
> Importing file /home/sahil/Work/solr/solr-5.5.3/lucene/common-build.xml
> from /home/sahil/Work/solr/solr-5.5.3/build.xml
> Overriding previous definition of reference to ant.projectHelper
> parsing buildfile /home/sahil/Work/solr/solr-5.5.3/lucene/common-build.xml
> with URI = file:/home/sahil/Work/solr/solr-5.5.3/lucene/common-build.xml
> 
> .
> .
> .
> // Deleted to make file smaller. Please tell if anything is needed.
> .
> .
> .
> 
> resolve:
> [ivy:retrieve] no resolved descriptor found: launching default resolve
> Overriding previous definition of property "ivy.version"
> [ivy:retrieve] using ivy parser to parse file:/home/sahil/Work/solr/
> solr-5.5.3/solr/core/ivy.xml
> [ivy:retrieve] :: resolving dependencies :: org.apache.solr#core;working@
> D-PGB7YZ
> [ivy:retrieve]  confs: [compile, compile.hadoop]
> [ivy:retrieve]  validate = true
> [ivy:retrieve]  refresh = false
> [ivy:retrieve] resolving dependencies for configuration 'compile'
> [ivy:retrieve] == resolving dependencies for
> org.apache.solr#core;working@D-PGB7YZ
> [compile]
> [ivy:retrieve] == resolving dependencies org.apache.solr#core;working@
> D-PGB7YZ->commons-codec#commons-codec;1.10 [compile->master]
> [ivy:retrieve] default: Checking cache for: dependency:
> commons-codec#commons-codec;1.10 {compile=[master]}
> [ivy:retrieve] don't use cache for commons-codec#commons-codec;1.10:
> checkModified=true
> [ivy:retrieve]  tried /home/sahil/.ivy2/local/commons-codec/commons-codec/1.
> 10/ivys/ivy.xml
> [ivy:retrieve]  tried /home/sahil/.ivy2/local/commons-codec/commons-codec/1.
> 10/jars/commons-codec.jar
> [ivy:retrieve]  local: no ivy file nor artifact found for
> commons-codec#commons-codec;1.10
> [ivy:retrieve] main: Checking cache for: dependency:
> commons-codec#commons-codec;1.10 {compile=[master]}
> [ivy:retrieve] main: module revision found in cache:
> commons-codec#commons-codec;1.10
> [ivy:retrieve]  found commons-codec#commons-codec;1.10 in public
> [ivy:retrieve] == resolving dependencies org.apache.solr#core;working@
> D-PGB7YZ->org.apache.commons#commons-exec;1.3 [compile->master]
> [ivy:retrieve] default: Checking cache for: dependency:
> org.apache.commons#commons-exec;1.3 {compile=[master]}
> [ivy:retrieve] don't use cache for org.apache.commons#commons-exec;1.3:
> checkModified=true
> [ivy:retrieve]  tried /home/sahil/.ivy2/local/org.
> apache.commons/commons-exec/1.3/ivys/ivy.xml
> [ivy:retrieve]  tried /home/sahil/.ivy2/local/org.
> apache.commons/commons-exec/1.3/jars/commons-exec.jar
> [ivy:retrieve]  local: no ivy file nor artifact found for
> org.apache.commons#commons-exec;1.3
> [ivy:retrieve] main: Checking cache for: dependency:
> org.apache.commons#commons-exec;1.3 {compile=[master]}
> [ivy:retrieve] main: module revision found in cache:
> org.apache.commons#commons-exec;1.3
> [ivy:retrieve]  found org.apache.commons#commons-exec;1.3 in public
> [ivy:retrieve] == resolving dependencies org.apache.solr#core;working@
> D-PGB7YZ->commons-fileupload#commons-fileupload;1.3.1 [compile->master]
> [ivy:retrieve] default: Checking cache for: dependency:
> commons-fileupload#commons-fileupload;1.3.1 {compile=[master]}
> [ivy:retrieve] don't use cache for 
> commons-fileupload#commons-fileupload;1.3.1:
> checkModified=true
> [ivy:retrieve]  local: revision in cache: commons-fileupload#commons-
> fileupload;1.3.1
> [ivy:retrieve]  found commons-fileupload#commons-fileupload;1.3.1 in local
> [ivy:retrieve] == 

Re: Java version set to 1.8 for SOLR 6.4.0

2017-02-13 Thread Sahil
Hi,

These links might be helpful to you.

http://askubuntu.com/questions/740757/switch-between-multiple-java-versions

http://lj4newbies.blogspot.in/2007/04/2-jvm-on-one-linux-box.html

Regards
Sahil



-----
Sahil Agarwal
--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-version-upgrade-from-5-1-0-to-6-4-0-documentation-tp4319173p4320031.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Java version set to 1.8 for SOLR 6.4.0

2017-02-13 Thread Uchit Patel

I have updated SOLR_JAVA_HOME in following file.
/opt/wml/solr-6.4.0/bin/solr.in.sh
SOLR_JAVA_HOME = "/opt/wml/jdk1.8.0_66/jre/bin/java" 
But it is not working.
Regards,
Uchit Patel
I have installed SOLR 6.4.0 on a Linux box. I have both Java 1.7.0 and 1.8.0 on 
the box. By default it points to 1.7.0, and some other applications use Java 
1.7.0. I want to set Java 1.8.0 only for SOLR 6.4.0. What do I need to update 
for only SOLR 6.4.0 to use Java 1.8.0? I don't want to remove Java 1.7.0 
because some other applications use it.
Thanks.
Regards,
Uchit Patel

  From: Uchit Patel 
 To: "gene...@lucene.apache.org" ; 
"solr-user@lucene.apache.org" ; 
"jan@cominvent.com"  
 Sent: Monday, February 13, 2017 3:38 PM
 Subject: Re: Java version set to 1.8 for SOLR 6.4.0
   
Hi ,
I tried SOLR_JAVA_HOME = "/opt/wml/jdk1.8.0_66/jre/bin/java" but it is not 
working.
Regards,
Uchit Patel
I have installed SOLR 6.4.0 on a Linux box. I have both Java 1.7.0 and 1.8.0 on 
the box. By default it points to 1.7.0, and some other applications use Java 
1.7.0. I want to set Java 1.8.0 only for SOLR 6.4.0. What do I need to update 
for only SOLR 6.4.0 to use Java 1.8.0? I don't want to remove Java 1.7.0 
because some other applications use it.
Thanks.
Regards,
Uchit Patel

Re: Java version set to 1.8 for SOLR 6.4.0

2017-02-13 Thread Uchit Patel
Hi ,
I tried SOLR_JAVA_HOME = "/opt/wml/jdk1.8.0_66/jre/bin/java" but it is not 
working.
Regards,
Uchit Patel
I have installed SOLR 6.4.0 on a Linux box. I have both Java 1.7.0 and 1.8.0 on 
the box. By default it points to 1.7.0, and some other applications use Java 
1.7.0. I want to set Java 1.8.0 only for SOLR 6.4.0. What do I need to update 
for only SOLR 6.4.0 to use Java 1.8.0? I don't want to remove Java 1.7.0 
because some other applications use it.
Thanks.
Regards,
Uchit Patel


RE: Simulating group.facet for JSON facets, high mem usage w/ sorting on aggregation...

2017-02-13 Thread Bryant, Michael
Thanks for letting me know Yonik, I'll watch this issue with interest.

BTW, I said Solr 4.6.1 in my original post - that should've been 6.4.1.

Cheers,
~Mike

From: Yonik Seeley [ysee...@gmail.com]
Sent: 10 February 2017 21:44
To: solr-user@lucene.apache.org
Subject: Re: Simulating group.facet for JSON facets, high mem usage w/ sorting 
on aggregation...

FYI, I just opened https://issues.apache.org/jira/browse/SOLR-10122 for this
-Yonik

On Fri, Feb 10, 2017 at 4:32 PM, Yonik Seeley  wrote:
> On Thu, Feb 9, 2017 at 6:58 AM, Bryant, Michael
>  wrote:
>> Hi all,
>>
>> I'm converting my legacy facets to JSON facets and am seeing much better 
>> performance, especially with high cardinality facet fields. However, the one 
>> issue I can't seem to resolve is excessive memory usage (and OOM errors) 
>> when trying to simulate the effect of "group.facet" to sort facets according 
>> to a grouping field.
>
> Yeah, I sort of expected this... but haven't gotten around to
> implementing something that takes less memory yet.
> If you're faceting on A and sorting by unique(B), then memory use is
> O(cardinality(A)*cardinality(B))
> We can definitely do a lot better.
>
> -Yonik
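
For reference, a minimal sketch of the kind of request under discussion
(facet on one field, buckets sorted by unique() of another), assuming
SolrJ 6.x and hypothetical field names "facetField" and "groupField":

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JsonFacetExample {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
      SolrQuery q = new SolrQuery("*:*");
      q.setRows(0);
      // Buckets of facetField ordered by the count of distinct groupField
      // values; memory grows with cardinality(facetField) * cardinality(groupField).
      q.add("json.facet",
          "{ byField : { type : terms, field : facetField, sort : 'groups desc',"
        + "  facet : { groups : 'unique(groupField)' } } }");
      QueryResponse rsp = client.query(q);
      System.out.println(rsp.getResponse().get("facets"));
    }
  }
}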