On Wed, Dec 2, 2015 at 10:17 AM, Sachith Withana <sach...@wso2.com> wrote:

> Now that we are using logstash out of the box, without the DASConnector,
> it won't do that.
>
> Logstash would just start publishing, and with the current design,
> AFAIK the schema setting would be handled by the LAS server.
>

Oh yeah, I see ..


>
> BTW for that requirement, can we provide a way to allow indexing all the
> columns?
>

Well .. we can .. I guess this is the same thing Malith requested in the
first mail. The only thing is, we would have to change the
internals/architecture of how we do indexing. The current logic is that we
check the input value against the table schema and do the required
indexing, for example, whether facets are defined, the data types, etc. So
if we just say "index all fields", that will be a new path there, and we
would also have to introduce a new special flag for a table to say "index
all". Also, we would need some mechanism for figuring out the fields of a
specific log type in the server, whereas with the table schema we at least
knew all the fields that exist across all the log types. Ideally, we need
to store some metadata somewhere saying that, for this specific log type,
these are the fields, and so on. Do we get some kind of log category/type
information with the standard logstash HTTP connector? .. Any other schema
setting and storing of metadata can be done on the server side, and we can
cache it in memory to do fast lookups and modifications of the schema
(together with some cluster messaging to keep it in sync with other nodes).
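
To make the idea above a bit more concrete, here is a rough sketch of the two
pieces: a per-table "index all" flag and an in-memory map of log type to known
fields. All the names here (TableSchema, LogTypeFieldCache, fields_to_index,
index_all) are made up for illustration and are not existing DAS classes or
APIs. Persistence and the cluster messaging are left out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class TableSchema:
    """Hypothetical stand-in for a table schema (not the DAS class)."""
    indexed_fields: Set[str] = field(default_factory=set)
    index_all: bool = False  # hypothetical "index all" flag for the table


class LogTypeFieldCache:
    """Hypothetical in-memory map of log type -> field names seen so far."""

    def __init__(self) -> None:
        self._fields: Dict[str, Set[str]] = {}

    def record(self, log_type: str, event: Dict[str, Any]) -> Set[str]:
        """Remember the event's fields; return only the newly seen ones."""
        known = self._fields.setdefault(log_type, set())
        new_fields = set(event) - known
        known |= new_fields
        return new_fields

    def fields_for(self, log_type: str) -> Set[str]:
        """Return a copy of the fields known for this log type."""
        return set(self._fields.get(log_type, set()))


def fields_to_index(schema: TableSchema, event: Dict[str, Any]) -> Set[str]:
    """Pick the fields to index: everything if the flag is set,
    otherwise only the fields the schema marks as indexed."""
    if schema.index_all:
        return set(event)
    return set(event) & schema.indexed_fields
```

With something like this, record() only reports a change when a log type sends
a field it has not sent before, so the schema (or the metadata store) would be
touched on that event alone rather than on every event.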

Or else, maybe we are again back to writing our own logstash adapter which
will make the whole thing much simpler? ..

Cheers,
Anjana.


>
> On Wed, Dec 2, 2015 at 10:11 AM, Anjana Fernando <anj...@wso2.com> wrote:
>
>> Hi Sachith,
>>
>> Doesn't the agent know the log types/categories and their field
>> information when it is initializing? .. As I understood it, we specify
>> which fields need to be sent out in the configuration, isn't that the
>> case? ..
>>
>> Cheers,
>> Anjana.
>>
>> On Wed, Dec 2, 2015 at 10:01 AM, Sachith Withana <sach...@wso2.com>
>> wrote:
>>
>>> Hi All,
>>>
>>> There might be a slight issue. We wouldn't know the arbitrary fields
>>> before the log agent starts publishing, since the agent only publishes and
>>> we don't have control over which fields are sent (unless we configure
>>> all the agents ourselves). So we would have to check, for each event,
>>> whether it has new fields beyond those already in the schema. This is
>>> undesirable.
>>>
>>> And as Anjana pointed out we don't have a way to specify to index all
>>> the arbitrary values unless we set the schema accordingly.
>>>
>>> Is it possible to specify in the schema to index everything?
>>>
>>> On Wed, Dec 2, 2015 at 9:38 AM, Anjana Fernando <anj...@wso2.com> wrote:
>>>
>>>> Hi Malith,
>>>>
>>>> The functionality you're requesting is very specific, and from the DAS
>>>> side it doesn't make sense to implement, in a generic way, something
>>>> that is not usually used. It is also not how the log analyzer should
>>>> use it anyway. The different log sources will know their fields before
>>>> they send out data, so this doesn't have to be checked every time an
>>>> event is published. A log source would first tell the log analyzer
>>>> backend API which new fields it will be sending; with that earlier
>>>> message, the backend service will set the global table's schema
>>>> properly, and then the remote log agent will send out log records to be
>>>> processed by the server.
>>>>
>>>> Cheers,
>>>> Anjana.
>>>>
>>>> On Tue, Dec 1, 2015 at 6:44 PM, Malith Dhanushka <mal...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Anjana,
>>>>>
>>>>> Yes. The requirement is for the internal log-related REST API, which is
>>>>> being written using OSGi services. From the perspective of log analysis
>>>>> data, we have one master table to persist all the log events from
>>>>> different log sources. Log data comes in to the log REST API as
>>>>> arbitrary fields, so different log sources have different sets of
>>>>> arbitrary fields. This leads the log REST API to change the schema of
>>>>> the master table every time it receives log events from a new/updated
>>>>> log source. That's what I meant by inaccurate, and it can be solved in a
>>>>> much cleaner way by having a flag to index or not index arbitrary
>>>>> fields for a particular stream.
>>>>>
>>>>> Thanks,
>>>>> Malith
>>>>>
>>>>> On Tue, Dec 1, 2015 at 6:06 PM, Anjana Fernando <anj...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Malith,
>>>>>>
>>>>>> No, it cannot be done like that. The way indexing happens is that
>>>>>> it looks up the table schema for a table and does the indexing
>>>>>> according to that, so the table schema must be set beforehand. It is
>>>>>> not something that can be set dynamically when arbitrary fields are
>>>>>> sent to the receiver, and it cannot load the current schema and set it
>>>>>> again for each event. We could cache that information and do some
>>>>>> operations, but that gets complicated. So the idea is that it is the
>>>>>> responsibility of the client to set the target table's schema properly
>>>>>> beforehand, which may or may not include arbitrary fields, and then
>>>>>> send the data.
>>>>>>
>>>>>> Also, if this requirement is for the log analytics solution work, as
>>>>>> we've discussed before, there should be a whole new remote API for that,
>>>>>> and that API can do these operations inside the server, using the OSGi
>>>>>> services, and not the original DAS REST API. So those operations will
>>>>>> happen automatically while keeping the remote log related API clean.
>>>>>>
>>>>>> Cheers,
>>>>>> Anjana.
>>>>>>
>>>>>> On Tue, Dec 1, 2015 at 5:13 PM, Malith Dhanushka <mal...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Folks,
>>>>>>>
>>>>>>> Currently, indexing arbitrary fields is achieved by dynamically
>>>>>>> updating the analytics table schema through the analytics REST API.
>>>>>>> This is not an accurate solution for a frequently updated schema. So
>>>>>>> the ideal solution would be to have a flag in the data bridge event
>>>>>>> sink configuration to enable/disable indexing for all arbitrary fields.
>>>>>>>
>>>>>>> WDYT?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Malith
>>>>>>> --
>>>>>>> Malith Dhanushka
>>>>>>> Senior Software Engineer - Data Technologies
>>>>>>> *WSO2, Inc. : wso2.com <http://wso2.com/>*
>>>>>>> *Mobile*          : +94 716 506 693
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Anjana Fernando*
>>>>>> Senior Technical Lead
>>>>>> WSO2 Inc. | http://wso2.com
>>>>>> lean . enterprise . middleware
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>
>>>
>>>
>>> --
>>> Sachith Withana
>>> Software Engineer; WSO2 Inc.; http://wso2.com
>>> E-mail: sachith AT wso2.com
>>> M: +94715518127
>>> Linked-In: https://lk.linkedin.com/in/sachithwithana
>>>
>>
>>
>>
>> --
>>
>
>
>
>



-- 
*Anjana Fernando*
Senior Technical Lead
WSO2 Inc. | http://wso2.com
lean . enterprise . middleware
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
