Errata: if people think it’s ok…

On 6/22/17, 1:48 PM, "Barona, Ricardo" <[email protected]> wrote:

    Completely agree. We recently added this Markdown document to the spot-ml
    folder:
    https://github.com/apache/incubator-spot/blob/master/spot-ml/SUSPICIOUS_CONNECTS_SCHEMA.md
    But we can always improve.
    
    Going back to the main issue, if people things it’s ok I’ll create an issue for:
    
    - Have spot-ml check the schema for Flow, DNS and Proxy input data
    - Make the documentation about the required schema for spot-ml more
      consistent for the case where spot-ingest is not used
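
To make the first item concrete, here is a minimal sketch of what such a check could look like. It is written in plain Python for brevity and is not spot-ml's actual code. The `REQUIRED_COLUMNS` table and the `validate_schema` helper are hypothetical, and the column names and types are placeholders; the authoritative list is in the SUSPICIOUS_CONNECTS_SCHEMA.md document linked above.

```python
from typing import Dict, List

# Illustrative only: these column names and types are placeholders and are
# NOT the authoritative spot-ml schema (see SUSPICIOUS_CONNECTS_SCHEMA.md).
REQUIRED_COLUMNS: Dict[str, Dict[str, str]] = {
    "flow": {"sip": "string", "dip": "string", "sport": "int", "dport": "int"},
    "dns": {"ip_dst": "string", "dns_qry_name": "string"},
}


def validate_schema(data_type: str, actual: Dict[str, str]) -> List[str]:
    """Return a list of readable problems; an empty list means the schema is OK.

    `actual` maps column name -> type name for the input table.
    Extra columns are tolerated; only required ones are checked.
    """
    if data_type not in REQUIRED_COLUMNS:
        return [f"unknown data type: {data_type}"]
    problems = []
    for name, expected_type in REQUIRED_COLUMNS[data_type].items():
        if name not in actual:
            problems.append(f"missing column: {name}")
        elif actual[name] != expected_type:
            problems.append(
                f"column {name}: expected {expected_type}, found {actual[name]}"
            )
    return problems


if __name__ == "__main__":
    # A table not produced by spot-ingest: one wrong type, one missing column.
    for problem in validate_schema(
        "flow", {"sip": "string", "dip": "string", "sport": "string"}
    ):
        print(problem)
```

Reporting every problem at once, and ignoring columns that are not required, is one way to keep validation from locking things down too much: data from other ingest paths only has to supply the fields the job actually reads.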
    
    
    On 6/22/17, 1:37 PM, "Michael Ridley" <[email protected]> wrote:
    
        I agree, having a data model defined and documented would help a lot in
        separating processing from a specific ingest flow.
        
        Michael
        
        On Thu, Jun 22, 2017 at 1:31 PM, Jonathan Natkins <[email protected]>
        wrote:
        
        > Personally, I'd love for there to be more information about the expected
        > schema for the ML jobs, as well as information about where the data can be
        > picked up from. The documentation seems to be mostly written with a
        > specific example in mind, so is not extremely helpful when trying to
        > integrate new data sources. A data dictionary would help with being able to
        > map fields from data formats (other logs, etc) to fields that spot-ml can
        > process.
        >
        > Whatever happened to the open data model that was being discussed for Spot?
        >
        > Thanks!
        > Natty
        >
        > On Thu, Jun 22, 2017 at 10:10 AM Barona, Ricardo <[email protected]>
        > wrote:
        >
        > > Hi everyone.
        > >
        > > I’m happy to see more and more people playing with Spot, and
        > > particularly with spot-ml.
        > >
        > > Something that I’ve noticed thanks to these two Jira issues (
        > > https://issues.apache.org/jira/browse/SPOT-149 and
        > > https://issues.apache.org/jira/browse/SPOT-174) is that sometimes users
        > > are going to want to try spot-ml without ingesting data using spot-ingest
        > > and I think that’s cool but seems like that can lead to inconsistent schema
        > > issues.
        > >
        > > I’d like to know what you think would be the best approach to deal
        > > with this. I’m thinking that we can add schema validation to spot-ml
        > > before anything else happens, but I don’t know if that’s going to lock
        > > things down too much.
        > >
        > > Please share your thoughts.
        > >
        > > Thanks,
        > > Ricardo Barona
        > >
        > --
        > Jonathan "Natty" Natkins
        > StreamSets | Field Engineering Director
        > mobile: 609.577.1600 | linkedin <http://www.linkedin.com/in/nattyice>
        >
        
        
        
        -- 
        Michael Ridley <[email protected]>
        office: (650) 352-1337
        mobile: (571) 438-2420
        Senior Solutions Architect
        Cloudera, Inc.
        
    
    
