Live publishing and solr performance optimization

2018-11-19 Thread Balanathagiri Ayyasamypalanivel
Hi,
We are in the process of setting up live publishing of documents to Solr,
and at the same time we have to maintain search performance.

Total existing docs : 120 million
Expected data for live publishing : 1 million

Every hour we will receive 1 million docs to publish live to the hot Solr
collection. Can you please suggest how we can do this effectively?

Regards,
Bala.


Re: Json object values in solr string field

2018-09-27 Thread Balanathagiri Ayyasamypalanivel
Thanks Alex/Shawn,

Yeah, currently we handle this by writing custom code that calculates the
assets from the response, but with this approach we lose the power of the
default stats and facet features.

Also, it's not actually duplicate data. In our current design the data for
one account resides in something like 2 docs, which we are planning to
compress while still being able to use stats and facets. I know it's quite
complicated to achieve both at the same time; I am still thinking about how
to solve it.

On Thu, Sep 27, 2018, 11:19 AM Alexandre Rafalovitch wrote:

> If the duplicate data is only indexed, it is not actually duplicated. It is
> only an index entry and the record ids where it shows.
>
> Regards,
> Alex
>
> On Thu, Sep 27, 2018, 10:55 AM Balanathagiri Ayyasamypalanivel, <
> bala.cit...@gmail.com> wrote:
>
> > Hi Alex, thanks, we have that set up already in place, we are thinking to
> > optimize more to redesign the data to avoid these duplication.
> >
> > Regards,
> > Bala.
> >
> > On Thu, Sep 27, 2018, 10:31 AM Alexandre Rafalovitch wrote:
> >
> > > Well, my feeling is that you are going in the wrong direction. And that
> > > maybe you need to focus more on separating your - non solr - storage
> > > representation and your - solr - search oriented representation.
> > >
> > > E.g. if your issue is storage, maybe you can focus on stored=false
> > > indexed=true approach.
> > >
> > > Regards,
> > > Alex
> > >
> > > On Thu, Sep 27, 2018, 10:13 AM Balanathagiri Ayyasamypalanivel, <
> > > bala.cit...@gmail.com> wrote:
> > >
> > > > Any suggestions?
> > > > Regards,
> > > > Bala.
> > > >
> > > > On Wed, Sep 26, 2018, 2:46 PM Balanathagiri Ayyasamypalanivel <
> > > > bala.cit...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > Thanks for the reply, actually we are planning to optimize the huge
> > > > > > volume of data.
> > > > > >
> > > > > > For example, in our current system we have as below, so we can do
> > > > > > facet pivot or stats to get the sum of asset_td for each acct, but
> > > > > > the data growing lot whenever more asset getting added.
> > > > >
> > > > > Id | Accts| assetid | asset_td
> > > > > 1| Acct1 | asset1 | 20
> > > > > 2| Acct1 | asset2 | 30
> > > > > 3| Acct2 | asset3 | 10
> > > > > 4| Acct3 | asset2 | 10
> > > > >
> > > > > So we planned to change as
> > > > >
> > > > > Id | Accts | asset_s
> > > > > 1  | Acct1 | [{"asset1": "20", "asset2":"30"}]
> > > > > 2  | Acct2 | [{"asset3": "10"}]
> > > > > 3  | Acct3 | [{"asset2": "10"}]
> > > > >
> > > > > > But only draw back here is we have to parse the json to do the sum
> > > > > > of the values, is there any other way to handle this scenario.
> > > > >
> > > > > Regards,
> > > > > Bala.
> > > > >
> > > > > > On Wed, Sep 26, 2018, 2:25 PM Shawn Heisey wrote:
> > > > > >
> > > > > >> On 9/26/2018 12:20 PM, Balanathagiri Ayyasamypalanivel wrote:
> > > > > >> > Currently I am storing json object type of values in string
> > > > > >> > field in solr.
> > > > > >> > Using this field, in the code I am parsing json objects and
> > > > > >> > doing sum of the values under it.
> > > > > >> >
> > > > > >> > In solr, do we have any option in doing it by default when using
> > > > > >> > the json object field values.
> > > > > >>
> > > > > >> Even if you have JSON-formatted strings in Solr, Solr doesn't know
> > > > > >> this.  It has no idea that the data is JSON, and won't be able to
> > > > > >> do anything special with the info contained there.
> > > > >>
> > > > >> Thanks,
> > > > >> Shawn
> > > > >>
> > > > >>
> > > >
> > >
> >
>


Re: Json object values in solr string field

2018-09-27 Thread Balanathagiri Ayyasamypalanivel
Hi Alex, thanks. We already have that setup in place; we are thinking of
optimizing further by redesigning the data to avoid this duplication.

Regards,
Bala.

On Thu, Sep 27, 2018, 10:31 AM Alexandre Rafalovitch wrote:

> Well, my feeling is that you are going in the wrong direction. And that
> maybe you need to focus more on separating your - non solr - storage
> representation and your - solr - search oriented representation.
>
> E.g. if your issue is storage, maybe you can focus on stored=false
> indexed=true approach.
>
> Regards,
> Alex
>
> On Thu, Sep 27, 2018, 10:13 AM Balanathagiri Ayyasamypalanivel, <
> bala.cit...@gmail.com> wrote:
>
> > Any suggestions?
> > Regards,
> > Bala.
> >
> > On Wed, Sep 26, 2018, 2:46 PM Balanathagiri Ayyasamypalanivel <
> > bala.cit...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Thanks for the reply, actually we are planning to optimize the huge
> > > volume of data.
> > >
> > > For example, in our current system we have as below, so we can do facet
> > > pivot or stats to get the sum of asset_td for each acct, but the data
> > > growing lot whenever more asset getting added.
> > >
> > > Id | Accts| assetid | asset_td
> > > 1| Acct1 | asset1 | 20
> > > 2| Acct1 | asset2 | 30
> > > 3| Acct2 | asset3 | 10
> > > 4| Acct3 | asset2 | 10
> > >
> > > So we planned to change as
> > >
> > > Id | Accts | asset_s
> > > 1  | Acct1 | [{"asset1": "20", "asset2":"30"}]
> > > 2  | Acct2 | [{"asset3": "10"}]
> > > 3  | Acct3 | [{"asset2": "10"}]
> > >
> > > But only draw back here is we have to parse the json to do the sum of
> > > the values, is there any other way to handle this scenario.
> > >
> > > Regards,
> > > Bala.
> > >
> > > On Wed, Sep 26, 2018, 2:25 PM Shawn Heisey wrote:
> > >
> > >> On 9/26/2018 12:20 PM, Balanathagiri Ayyasamypalanivel wrote:
> > >> > Currently I am storing json object type of values in string field in
> > >> > solr.
> > >> > Using this field, in the code I am parsing json objects and doing
> > >> > sum of the values under it.
> > >> >
> > >> > In solr, do we have any option in doing it by default when using the
> > >> > json object field values.
> > >>
> > >> Even if you have JSON-formatted strings in Solr, Solr doesn't know
> > >> this.  It has no idea that the data is JSON, and won't be able to do
> > >> anything special with the info contained there.
> > >>
> > >> Thanks,
> > >> Shawn
> > >>
> > >>
> >
>


Re: Json object values in solr string field

2018-09-27 Thread Balanathagiri Ayyasamypalanivel
Thanks Shawn for your prompt response.
Actually, we have to apply filters at query time while calculating the score.

The challenge here is that we cannot sum the assets and store the result as
a static field at index time. The asset total needs to be calculated at
query time, with some filters applied.
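
With the flat layout from the example table in this thread (one doc per
asset), a per-account sum under query-time filters could be requested through
Solr's JSON Facet API. A minimal sketch in Python that only builds the request
body: the field names Accts, assetid and asset_td come from the example table,
while the function name, the facet label and the sample filter are
illustrative assumptions, and asset_td is assumed to be a numeric field.

```python
import json


def build_sum_request(filters):
    """Build a JSON Request API body that sums asset_td per account.

    `filters` is a list of Solr filter queries applied at query time.
    """
    return {
        "query": "*:*",
        "filter": list(filters),
        "limit": 0,  # only the facet buckets are needed, not the docs
        "facet": {
            "by_acct": {  # hypothetical facet label
                "type": "terms",
                "field": "Accts",
                "facet": {"total": "sum(asset_td)"},
            }
        },
    }


# Sample filter is an assumption for illustration only.
body = build_sum_request(["assetid:(asset1 OR asset2)"])
print(json.dumps(body, indent=2))
```

The resulting body would be POSTed as JSON to the collection's search handler.
This only works while each asset value lives in its own numeric field; it
does not apply to values packed into a JSON string.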

Regards,
Bala.

On Thu, Sep 27, 2018, 10:35 AM Shawn Heisey  wrote:

> On 9/26/2018 12:46 PM, Balanathagiri Ayyasamypalanivel wrote:
> > But only draw back here is we have to parse the json to do the sum of the
> > values, is there any other way to handle this scenario.
>
> Solr cannot do that for you.  You could put this in your indexing
> software -- add up the numbers and put the result into a new field in
> your Solr document, so that the information is already in the index when
> you do your query.  This could be done with a custom Update Processor (a
> Solr plugin that you would need to write), but if you already have
> custom indexing software, it's probably easier to simply change that
> software than to try and write a plugin.
>
> Thanks,
> Shawn
>
>
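
Shawn's index-time suggestion quoted above can be sketched roughly as below.
This is a minimal Python sketch of a step in the indexing software, not a
Solr Update Processor; the function name and the target field name
asset_total_i are hypothetical, and the asset_s format is taken from the
example in this thread. It only helps when the total does not depend on
query-time filters.

```python
import json


def add_asset_total(doc):
    """Return a copy of `doc` with the summed asset values in a new field."""
    # asset_s looks like: [{"asset1": "20", "asset2": "30"}]
    assets = json.loads(doc["asset_s"])
    total = sum(int(v) for entry in assets for v in entry.values())
    enriched = dict(doc)
    enriched["asset_total_i"] = total  # hypothetical field name
    return enriched


doc = {"id": "1", "Accts": "Acct1",
       "asset_s": '[{"asset1": "20", "asset2": "30"}]'}
print(add_asset_total(doc)["asset_total_i"])  # prints 50
```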


Re: Json object values in solr string field

2018-09-27 Thread Balanathagiri Ayyasamypalanivel
Any suggestions?
Regards,
Bala.

On Wed, Sep 26, 2018, 2:46 PM Balanathagiri Ayyasamypalanivel <
bala.cit...@gmail.com> wrote:

> Hi,
>
> Thanks for the reply, actually we are planning to optimize the huge volume
> of data.
>
> For example, in our current system we have as below, so we can do facet
> pivot or stats to get the sum of asset_td for each acct, but the data
> growing lot whenever more asset getting added.
>
> Id | Accts| assetid | asset_td
> 1| Acct1 | asset1 | 20
> 2| Acct1 | asset2 | 30
> 3| Acct2 | asset3 | 10
> 4| Acct3 | asset2 | 10
>
> So we planned to change as
>
> Id | Accts | asset_s
> 1  | Acct1 | [{"asset1": "20", "asset2":"30"}]
> 2  | Acct2 | [{"asset3": "10"}]
> 3  | Acct3 | [{"asset2": "10"}]
>
> But only draw back here is we have to parse the json to do the sum of the
> values, is there any other way to handle this scenario.
>
> Regards,
> Bala.
>
> On Wed, Sep 26, 2018, 2:25 PM Shawn Heisey  wrote:
>
>> On 9/26/2018 12:20 PM, Balanathagiri Ayyasamypalanivel wrote:
>> > Currently I am storing json object type of values in string field in
>> > solr.
>> > Using this field, in the code I am parsing json objects and doing sum of
>> > the values under it.
>> >
>> > In solr, do we have any option in doing it by default when using the
>> > json object field values.
>>
>> Even if you have JSON-formatted strings in Solr, Solr doesn't know
>> this.  It has no idea that the data is JSON, and won't be able to do
>> anything special with the info contained there.
>>
>> Thanks,
>> Shawn
>>
>>


Re: Solr Search Special Characters

2018-09-27 Thread Balanathagiri Ayyasamypalanivel
Hi,
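You can escape all of these characters with a backslash (\).
Ex:
\&
\-

However, for the "&" character the backslash alone will not work if you try
the query directly in a browser, because "&" also separates URL parameters
and has to be URL-encoded (%26) as well.
It will work when you use the Solr client APIs in your code.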
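
The two steps (backslash escaping, then URL encoding) can be sketched in
Python as below. This is a minimal sketch using only the standard library;
the function name and the exact set of escaped characters are assumptions
(the set follows the query parser's usual special characters plus "&"), and
it does not handle keywords such as AND.

```python
from urllib.parse import urlencode

# Assumed set of characters to escape; adjust for your query parser.
SPECIAL_CHARS = set('\\+-!(){}[]^"~*?:/&|')


def escape_query(text):
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


escaped = escape_query("AT&T (US)")
print(escaped)                    # AT\&T \(US\)
print(urlencode({"q": escaped}))  # q=AT%5C%26T+%5C%28US%5C%29
```

The second line is what has to appear in the URL; typing the first one
directly into the browser address bar would cut the query off at the "&".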

Regards,
Bala.


On Thu, Sep 27, 2018, 6:52 AM Shawn Heisey  wrote:

> On 9/26/2018 10:39 PM, Rathor, Piyush (US - Philadelphia) wrote:
> > We are facing some issues in search with special characters. Can you
> > please help in query if the search is done using following characters:
> >
> > • “&”
> > • AND
> > • (
> > • )
>
> There are two ways.  One is to escape them.  The escaping character in
> Solr is a backslash.
>
> \&
> AN\D
> \(
> \)
>
> The other is to use a technique or a query parser that treats all
> characters as literal.  Putting information inside double quotes
> sometimes works, but this also makes the query a phrase query, which
> might produce incorrect results.  One query parser that treats all
> characters literally is the field parser:
>
>
> https://lucene.apache.org/solr/guide/6_6/other-parsers.html#OtherParsers-FieldQueryParser
>
> If you use escaping, you don't have to change your query parser, which
> can completely change how your query text is interpreted.  Sometimes
> changing the query parser is the best option, sometimes it isn't.
>
> URL encoding the characters as Atita suggests won't help. That's
> something you have to do anyway just to get certain characters to be
> successfully sent in a URL.  Most Solr clients for programming languages
> will do the URL encoding for you.
>
> Thanks,
> Shawn
>
>


Re: Json object values in solr string field

2018-09-26 Thread Balanathagiri Ayyasamypalanivel
Hi,

Thanks for the reply. We are actually planning to optimize a huge volume
of data.

For example, our current system stores the data as below, so we can use
facet pivot or stats to get the sum of asset_td for each acct, but the data
grows a lot whenever more assets are added.

Id | Accts | assetid | asset_td
1  | Acct1 | asset1  | 20
2  | Acct1 | asset2  | 30
3  | Acct2 | asset3  | 10
4  | Acct3 | asset2  | 10

So we planned to change it to:

Id | Accts | asset_s
1  | Acct1 | [{"asset1": "20", "asset2":"30"}]
2  | Acct2 | [{"asset3": "10"}]
3  | Acct3 | [{"asset2": "10"}]

The only drawback is that we have to parse the JSON to sum the values.
Is there any other way to handle this scenario?
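
For reference, the planned change from one doc per asset to one doc per
account amounts to a grouping step like the one below. A minimal Python
sketch: the function name is hypothetical, and the input rows and output
shape simply mirror the two example tables above.

```python
import json


def compress_by_account(rows):
    """Group flat asset rows into one doc per account with a JSON string field."""
    grouped = {}
    for row in rows:
        assets = grouped.setdefault(row["Accts"], {})
        assets[row["assetid"]] = str(row["asset_td"])
    return [
        {"Id": i, "Accts": acct, "asset_s": json.dumps([assets])}
        for i, (acct, assets) in enumerate(grouped.items(), start=1)
    ]


rows = [
    {"Id": 1, "Accts": "Acct1", "assetid": "asset1", "asset_td": 20},
    {"Id": 2, "Accts": "Acct1", "assetid": "asset2", "asset_td": 30},
    {"Id": 3, "Accts": "Acct2", "assetid": "asset3", "asset_td": 10},
    {"Id": 4, "Accts": "Acct3", "assetid": "asset2", "asset_td": 10},
]
docs = compress_by_account(rows)
for d in docs:
    print(d)
```

Four rows become three docs, but asset_s is now an opaque string as far as
Solr is concerned, which is why the sum has to be computed outside Solr.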

Regards,
Bala.

On Wed, Sep 26, 2018, 2:25 PM Shawn Heisey  wrote:

> On 9/26/2018 12:20 PM, Balanathagiri Ayyasamypalanivel wrote:
> > Currently I am storing json object type of values in string field in
> > solr.
> > Using this field, in the code I am parsing json objects and doing sum of
> > the values under it.
> >
> > In solr, do we have any option in doing it by default when using the json
> > object field values.
>
> Even if you have JSON-formatted strings in Solr, Solr doesn't know
> this.  It has no idea that the data is JSON, and won't be able to do
> anything special with the info contained there.
>
> Thanks,
> Shawn
>
>


Json object values in solr string field

2018-09-26 Thread Balanathagiri Ayyasamypalanivel
Hi,
Currently I am storing JSON-object values in a string field in Solr.
In my code, I parse the JSON objects from this field and sum the values
under them.

Does Solr have any built-in option for doing this when a field holds JSON
object values?

Regards,
Bala.