On Wed, Dec 2, 2015 at 4:53 PM, Malith Dhanushka <mal...@wso2.com> wrote:

>
>
> On Wed, Dec 2, 2015 at 4:47 PM, Sinthuja Ragendran <sinth...@wso2.com>
> wrote:
>
>> Hi Malith,
>>
>> On Wed, Dec 2, 2015 at 4:41 PM, Malith Dhanushka <mal...@wso2.com> wrote:
>>
>>> Hi Folks,
>>>
>>> We had an offline chat about this.
>>>
>>> Since indexing all the arbitrary fields is not feasible with the current
>>> architecture, the requirement of indexing arbitrary fields in the log
>>> analyzer will be handled in the Log Analyzer REST API. The idea is to
>>> compare each incoming event with the existing schema, which is kept
>>> in memory, and to update the table schema if there is a change.
>>>
>>
>> In this case, all the fields are going to be indexed? Is there any way
>> with this solution to say I need specific fields (say x, y, z) to be
>> indexed in the log event and not all the fields?
>>
>
> No. With this approach the client won't send the table schema beforehand.
> When an event's fields change, the REST API will dynamically update the
> schema. Since this is a log analyzer specific scenario, all the events need
> to be indexed.
>

Even in the log analyzer scenario, is it always necessary to index all fields?
The number of indexed fields will have some influence on resource
utilization, load, etc. in the indexing operation, and hence IMHO users
should have a way to index only the fields they are interested in using
in the log search operation. For example, I have log events with the fields
(ip address, timestamp, operation, resource, data transferred,
client-browser), but I'm only going to search by ip address, timestamp,
operation, resource and data transferred. client-browser is not going to be
one of my search fields, but I still want to see it in my final log search
result. Is there any way to achieve this?
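
To make the request above concrete, here is a minimal sketch of "store every
field, index only some". It assumes a hypothetical per-column `indexed` flag;
the `Column` class, `SCHEMA` list and `split_event` function are illustrative
names only and are not the DAS table schema API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column of a hypothetical log table schema."""
    name: str
    indexed: bool  # True: searchable; False: stored and returned only


# Hypothetical schema for the example log event described above.
SCHEMA = [
    Column("ip_address", True),
    Column("timestamp", True),
    Column("operation", True),
    Column("resource", True),
    Column("data_transferred", True),
    Column("client_browser", False),  # kept in results, never indexed
]


def split_event(event, schema):
    """Return (stored_record, index_document) for one log event."""
    stored = {c.name: event[c.name] for c in schema if c.name in event}
    index_doc = {c.name: event[c.name] for c in schema
                 if c.indexed and c.name in event}
    return stored, index_doc


# Sample values are made up for illustration.
event = {
    "ip_address": "10.0.0.7",
    "timestamp": 1449053580,
    "operation": "GET",
    "resource": "/orders",
    "data_transferred": 2048,
    "client_browser": "Firefox",
}
stored, index_doc = split_event(event, SCHEMA)
print(sorted(stored))     # every field is persisted
print(sorted(index_doc))  # only the searchable fields reach the index
```

In this sketch the search result would be built from `stored`, so
client-browser still shows up there even though it never costs anything at
indexing time.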

At least the user should be able to remove an unwanted indexed field from
the table schema with the management console or the Analytics REST API, but
I think with this solution, once a log event is received with that
particular arbitrary field, it's going to add the field again for indexing.
Please correct me if I'm wrong.
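
The concern above can be sketched as follows. This is only an illustration
under stated assumptions: `merge_schema` and the `excluded` set are
hypothetical, and the real schema update in the Log Analyzer REST API may
work differently.

```python
def merge_schema(indexed_fields, event, excluded):
    """Add unseen event fields to the indexed set, skipping excluded ones.

    Mutates indexed_fields in place and returns True when the schema
    changed and would need to be persisted.
    """
    new_fields = set(event) - indexed_fields - excluded
    indexed_fields |= new_fields
    return bool(new_fields)


indexed = {"ip_address", "timestamp"}
event = {"ip_address": "10.0.0.7", "timestamp": 1449053580,
         "client_browser": "Firefox"}

# Naive merge: the user removed client_browser earlier, but it comes back
# as soon as the next event carrying that field arrives.
naive = set(indexed)
merge_schema(naive, event, excluded=set())

# Merge with a remembered exclusion list: the removal survives.
guarded = set(indexed)
changed = merge_schema(guarded, event, excluded={"client_browser"})

print(sorted(naive))
print(sorted(guarded), changed)
```

Without something like the exclusion set, removing a field through the
management console would only last until the next event that carries it.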

Thanks,
Sinthuja.


>
> Thanks
>
>>
>> Thanks,
>> Sinthuja.
>>
>>>
>>> Overriding the table schema will make the event sink configuration
>>> inconsistent with the table schema. To avoid that, the event sink feature
>>> needs to be improved to support merging table schemas. For that, the event
>>> persist feature should have a flag to enable/disable merging table schemas.
>>>
>>> Thanks,
>>>
>>> On Wed, Dec 2, 2015 at 1:30 PM, Sinthuja Ragendran <sinth...@wso2.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Dec 2, 2015 at 11:05 AM, Anjana Fernando <anj...@wso2.com>
>>>> wrote:
>>>>
>>>>> On Wed, Dec 2, 2015 at 10:17 AM, Sachith Withana <sach...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Now that we are using logstash out of the box, without the
>>>>>> DASConnector, it won't do that.
>>>>>>
>>>>>> Logstash would just start publishing, and with the current design,
>>>>>> AFAIK the schema setting would be handled by the LAS server.
>>>>>>
>>>>>
>>>>> Oh yeah, I see ..
>>>>>
>>>>>
>>>>>>
>>>>>> BTW for that requirement, can we provide a way to allow indexing all
>>>>>> the columns?
>>>>>>
>>>>>
>>>>> Well .. we can .. I guess this is the same thing that Malith requested in
>>>>> the first mail. The only thing is, we have to change the
>>>>> internals/architecture of how we do indexing currently. The current logic
>>>>> is that we check the input value against the table schema and do the
>>>>> required indexing, for example, whether facets are defined, data types,
>>>>> etc. So if we are just saying to index all fields, it will be a new path
>>>>> there, and we also have to introduce a new special flag for a table to
>>>>> say, index all. Also, we would need some mechanism for figuring out the
>>>>> fields of a specific log type in the server, whereas at least with the
>>>>> table schema, we knew all the fields that are there for all the log
>>>>> types. Ideally, we need to store some metadata somewhere saying, for this
>>>>> specific log type, these are the fields and so on. Do we get some kind of
>>>>> log category/type information with the standard logstash HTTP
>>>>> connector? .. Any other schema setting and storing of metadata can be
>>>>> done on the server side, and we can cache it in memory to do fast lookups
>>>>> and modifications of the schema (together with some cluster messaging to
>>>>> keep it in sync with other nodes).
>>>>>
>>>>> Or else, maybe we are again back to writing our own logstash adapter
>>>>> which will make the whole thing much simpler? ..
>>>>>
>>>>
>>>> Yeah +1, actually I was also thinking that having our own logstash
>>>> adaptor would be a better and cleaner way, without complicating things
>>>> much. :) Simply, if we are able to mention on the client side which
>>>> fields need to be indexed, and then make a call to the LAS REST service
>>>> before publishing data, then we can set the schema accordingly and things
>>>> will work without any big effort.
>>>>
>>>> Thanks,
>>>> Sinthuja.
>>>>
>>>>
>>>>> Cheers,
>>>>> Anjana.
>>>>>
>>>>>
>>>>>>
>>>>>> On Wed, Dec 2, 2015 at 10:11 AM, Anjana Fernando <anj...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Sachith,
>>>>>>>
>>>>>>> Doesn't the agent have the knowledge of the log types/categories and
>>>>>>> their field information when it is initializing? .. as in, as I
>>>>>>> understood, we give what fields need to be sent out in the
>>>>>>> configurations, isn't that the case? ..
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Anjana.
>>>>>>>
>>>>>>> On Wed, Dec 2, 2015 at 10:01 AM, Sachith Withana <sach...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> There might be a slight issue. We wouldn't know the arbitrary
>>>>>>>> fields before the log agent starts publishing, since the agent only
>>>>>>>> publishes and we don't have control over which fields would be sent
>>>>>>>> (unless we configure all the agents ourselves). So we would have to
>>>>>>>> check, for each event, whether there are new fields apart from those
>>>>>>>> already in the schema. This is undesirable.
>>>>>>>>
>>>>>>>> And as Anjana pointed out we don't have a way to specify to index
>>>>>>>> all the arbitrary values unless we set the schema accordingly.
>>>>>>>>
>>>>>>>> Is it possible to specify in the schema to index everything?
>>>>>>>>
>>>>>>>> On Wed, Dec 2, 2015 at 9:38 AM, Anjana Fernando <anj...@wso2.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Malith,
>>>>>>>>>
>>>>>>>>> The functionality which you're requesting is very specific, and
>>>>>>>>> from the DAS side, it doesn't make sense to implement this in a
>>>>>>>>> generic way, since it is not something that is usually used. And it
>>>>>>>>> is anyway not the way the log analyzer should use it. The different
>>>>>>>>> log sources will know their fields before they send out data, so it
>>>>>>>>> doesn't have to be checked every time an event is published. A log
>>>>>>>>> source would tell the log analyzer backend API the new fields this
>>>>>>>>> specific log source will be sending, and with that earlier message,
>>>>>>>>> the backend service will set the global table's schema properly, and
>>>>>>>>> then the remote log agent will be sending out log records to be
>>>>>>>>> processed by the server.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Anjana.
>>>>>>>>>
>>>>>>>>> On Tue, Dec 1, 2015 at 6:44 PM, Malith Dhanushka <mal...@wso2.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Anjana,
>>>>>>>>>>
>>>>>>>>>> Yes. The requirement is for the internal log related REST API, which
>>>>>>>>>> is being written using OSGi services. From the perspective of log
>>>>>>>>>> analysis data, we have one master table to persist all the log
>>>>>>>>>> events from different log sources. Log data comes in to the log REST
>>>>>>>>>> API as arbitrary fields. So different log sources have different
>>>>>>>>>> sets of arbitrary fields, which leads the log REST API to change the
>>>>>>>>>> schema of the master table every time it receives log events from a
>>>>>>>>>> new/updated log source. That's what I meant by inaccurate, and it
>>>>>>>>>> can be solved in a much cleaner way by having that flag to index or
>>>>>>>>>> not to index arbitrary fields for a particular stream.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Malith
>>>>>>>>>>
>>>>>>>>>> On Tue, Dec 1, 2015 at 6:06 PM, Anjana Fernando <anj...@wso2.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Malith,
>>>>>>>>>>>
>>>>>>>>>>> No, it cannot be done like that. The way indexing happens is that
>>>>>>>>>>> it looks up the table schema for a table and does the indexing
>>>>>>>>>>> according to that. So the table schema must be set beforehand. It
>>>>>>>>>>> is not a dynamic thing that can be set when arbitrary fields are
>>>>>>>>>>> sent to the receiver, and it cannot load the current schema and
>>>>>>>>>>> set it again for each event. We could cache that information and
>>>>>>>>>>> do some operations, but that gets complicated. So the idea is, it
>>>>>>>>>>> is the responsibility of the client to set the target table's
>>>>>>>>>>> schema properly beforehand, which may or may not include arbitrary
>>>>>>>>>>> fields, and then send the data.
>>>>>>>>>>>
>>>>>>>>>>> Also, if this requirement is for the log analytics solution
>>>>>>>>>>> work, as we've discussed before, there should be a whole new
>>>>>>>>>>> remote API for that, and that API can do these operations inside
>>>>>>>>>>> the server, using the OSGi services, and not the original DAS REST
>>>>>>>>>>> API. So those operations will happen automatically while keeping
>>>>>>>>>>> the remote log related API clean.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Anjana.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Dec 1, 2015 at 5:13 PM, Malith Dhanushka <
>>>>>>>>>>> mal...@wso2.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Folks,
>>>>>>>>>>>>
>>>>>>>>>>>> Currently, indexing arbitrary fields is achieved by dynamically
>>>>>>>>>>>> updating the analytics table schema through the analytics REST
>>>>>>>>>>>> API. This is not an accurate solution for a frequently updating
>>>>>>>>>>>> schema. So the ideal solution would be to have a flag in the data
>>>>>>>>>>>> bridge event sink configuration to enable/disable indexing for all
>>>>>>>>>>>> arbitrary fields.
>>>>>>>>>>>>
>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Malith
>>>>>>>>>>>> --
>>>>>>>>>>>> Malith Dhanushka
>>>>>>>>>>>> Senior Software Engineer - Data Technologies
>>>>>>>>>>>> *WSO2, Inc. : wso2.com <http://wso2.com/>*
>>>>>>>>>>>> *Mobile*          : +94 716 506 693
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> *Anjana Fernando*
>>>>>>>>>>> Senior Technical Lead
>>>>>>>>>>> WSO2 Inc. | http://wso2.com
>>>>>>>>>>> lean . enterprise . middleware
>>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sachith Withana
>>>>>>>> Software Engineer; WSO2 Inc.; http://wso2.com
>>>>>>>> E-mail: sachith AT wso2.com
>>>>>>>> M: +94715518127
>>>>>>>> Linked-In: <http://goog_416592669>
>>>>>>>> https://lk.linkedin.com/in/sachithwithana
>>>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Sinthuja Rajendran*
>>>> Associate Technical Lead
>>>> WSO2, Inc.:http://wso2.com
>>>>
>>>> Blog: http://sinthu-rajan.blogspot.com/
>>>> Mobile: +94774273955
>>>>
>>>>
>>>>



-- 
*Sinthuja Rajendran*
Associate Technical Lead
WSO2, Inc.:http://wso2.com

Blog: http://sinthu-rajan.blogspot.com/
Mobile: +94774273955
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
