Re: nfdump install instructions seem inaccurate

2020-02-04 Thread Nate Smith
I’ve not been active in this project for a while, so forgive me if this has 
already been addressed. 

I’m assuming this is for spot-nfdump?

If so, it’s worth saying that unless someone has been maintaining it, we should 
look at removing the dependency. 

Originally we cooked our own version due to timestamp issues. But this should 
really be handled post-ingestion IMHO. 

- nathanael 

> On Feb 4, 2020, at 3:40 PM, skip cruse  wrote:
> 
> I noticed when setting up nfdump for spot-ingest that there were some
> errors around the version of automake that’s required. Apparently automake
> 1.14 is required, and after installing that everything worked fine. I did
> some digging and it looks like this was already raised via SPOT-178:
> https://issues.apache.org/jira/browse/SPOT-178. Perhaps someone could
> update the website to reflect this dependency and add the additional setup
> steps in there as well?
> 
> Cheers,
> Skip
> -- 
> if( bool halfWayThere == true);
> printf "WAH! LIVIN ON A PRAYER";
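
A minimal sketch of the version check Skip describes, which could run before building nfdump. It assumes that "1.14 is required" means 1.14 or newer (the thread does not say whether newer releases also work), and the helper names `parse_version` and `automake_ok` are hypothetical, not part of Spot.

```python
import re
import shutil
import subprocess

# Assumption: the thread's "automake 1.14 is required" is treated as a minimum.
REQUIRED = (1, 14)


def parse_version(text):
    """Return the first dotted version number found in `text` as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def automake_ok(version_output, required=REQUIRED):
    """True if the version reported in `version_output` is at least `required`."""
    return parse_version(version_output) >= required


if __name__ == "__main__":
    if shutil.which("automake") is None:
        print("automake not found on PATH")
    else:
        result = subprocess.run(
            ["automake", "--version"], capture_output=True, text=True, check=True
        )
        first_line = result.stdout.splitlines()[0]
        print(first_line, "->", "ok" if automake_ok(first_line) else "too old")
```

Comparing tuples of integers avoids the string-comparison trap where "1.9" would sort after "1.14".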


Re: [SPOT-INGEST] Ingest file organization

2020-01-14 Thread Nate Smith
Perhaps separating by framework would be good,

./spot-ingest/python
./spot-ingest/spark-streaming

Just my 2 cents,

- nathanael 

> On Jan 14, 2020, at 4:45 PM, Skip Cruse  wrote:
> 
> We should keep the name /spot-ingest/ for the original ingester, but move 
> the new ingester to /spot-ingest-sparkstreaming/ or similar.  Hopefully we 
> can use the ticket to track down the files that were created, so we can move 
> them to a new home easily.
> 
> 
> From: Tadd Wood 
> Sent: Tuesday, January 14, 2020 5:51 PM
> To: dev@spot.incubator.apache.org
> Subject: [SPOT-INGEST] Ingest file organization
> 
> I noticed that after SPOT-141 was introduced (a new kind of Spot Ingest,
> using PySpark Streaming) that it overlaid the new code on top of the old
> code on /spot-ingest/. When debugging the code, it makes it hard to
> determine which files are relevant to the new or the old ingest process. We
> should split them apart. Thoughts?
> 
> Thank you,
> Tadd Wood


Re: ODM Merge?

2019-03-21 Thread Nate Smith
As I recall, we can merge Envelope without affecting the existing code. 
But as you pointed out there are other gaps, such as the UI. 
At this point I say merge as much as we can (assuming no obvious code quality 
issues), as any movement at this point is positive. If something breaks then we 
know what needs to be fixed. 

- nathanael 

> On Mar 21, 2019, at 6:07 PM, Tadd Wood  wrote:
> 
> Alan,
> 
> I can help organize the open PRs.  Right now the biggest barrier to merging 
> in the ODM branch is bridging the gap between the ingest code and the ODM.  
> @curtishoward did some great work in PR #144 using Envelope as the ingest 
> framework for populating the ODM.  I will reach out to see what work is left 
> to finish up that PR so we can merge it in.
> 
> Thank you,
> Tadd Wood
> 
> 
>> On Mar 21, 2019, at 4:02 PM, Alan Ross  wrote:
>> 
>> thanks for the reply, Pierre-Luc.
>> 
>> Any input on merging PRs? Is there a list of current open and which ones
>> have been reviewed?
>> 
>>> On Thu, Mar 21, 2019 at 1:10 PM Pierre-Luc Dion  wrote:
>>> 
>>> It looks like there are a few pending PRs waiting to be merged to this
>>> branch. Wouldn't it make sense to merge all of those first, then merge the
>>> SPOT-181_odm branch into master?
>>> I'm not a committer so I can't help with that, but I can help with review
>>> wherever possible.
>>> 
>>> The PR pile looks stalled; a lot of PRs are becoming old :-(
>>> 
>>>> On Tue, Mar 19, 2019 at 4:59 PM Alan Ross  wrote:
>>>> 
>>>> Hey team,
>>>> 
>>>> It's hard for people to find the ODM as it appears to be tied up in request
>>>> 181. Can someone merge this? Not sure if we need to bring it for vote but I
>>>> support it being merged.
>>>> 
>>>> https://github.com/apache/incubator-spot/blob/SPOT-181_ODM/docs/open-data-model.md
>>>> 
>>>> Thanks, Alan
>>>> 
>>> 
> 


Re: Spot Meeting & Notes Nov 30, 2018

2018-11-30 Thread Nate Smith
DNS - I think someone with access will need to manually check the share for 
that data set.
As far as I know, nothing has been deleted. 

- nathanael 

> On Nov 30, 2018, at 12:30 PM, Mark Schoeni  wrote:
> 
> Hello everyone,
> Below are the updates from the past couple of weeks.
> Also, we have a podling report due soon. Will state we need some advice on
> community building and at least 1 more mentor when we start the release
> process.
> 
> Mentioned last week, I believe Curtis is still looking for the "DNS labeled
> data set" mentioned in this document (DATA_SAMPLE.md).
> If anyone has a copy could you please give us a link so we can make it
> publicly available again; thanks.
> 
> *JIRA*
> *SPOT-287* *spot-setup/hdfs_setup.sh fail to execute `sudo` commands*
> Notes: Needs 2 more +1s before we can accept and close this merge request.
> 
> *SPOT-288* *TypeError: () got an unexpected keyword argument 'date'*
> Notes: Bug regarding the GraphQL
> 
> *SPOT-289* *GraphQLLocatedError*
> 
> *Pull Reqs*
> *PR#151:* Refer above to SPOT-287
> 
> *PR#150:* Looking for more information on if this PR is trying to resolve
> SPOT-181 or is solving a much smaller issue which would need a new JIRA
> issue. Please let me know if anyone has more information.
> 
> As usual I will be on the meeting (https://meet.google.com/tpt-gwcv-qgv) if
> anyone has questions or concerns they would like to bring up.


Re: Podling Report Reminder - August 2018

2018-08-06 Thread Nate Smith
It seems we are already past the deadline for a report.

Has anyone drafted a proposed report yet?

- Nathanael

> On Aug 6, 2018, at 6:19 AM, jmcl...@apache.org wrote:
> 
> Dear podling,
> 
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
> 
> The board meeting is scheduled for Wed, 15 August 2018, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, August 01).
> 
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
> 
> Candidate names should not be made public before people are actually
> elected, so please do not include the names of potential committers or
> PPMC members in your report.
> 
> Thanks,
> 
> The Apache Incubator PMC
> 
> Submitting your Report
> 
> --
> 
> Your report should contain the following:
> 
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
>     the project or necessarily of its field
> *   A list of the three most important issues to address in the move
>     towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
>     aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
> *   How does the podling rate their own maturity.
> 
> This should be appended to the Incubator Wiki page at:
> 
> https://wiki.apache.org/incubator/August2018
> 
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
> 
> Mentors
> ---
> 
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
> 
> Incubator PMC



Re: Spot Project Meeting - Date & Time is Set

2018-07-27 Thread Nate Smith
https://hangouts.google.com/hangouts/_/7wkrf7axonf6dczn2dlm6kxntme

This should work

> On Jul 27, 2018, at 12:31 PM, Austin Leahy  wrote:
> 
> can someone provide me a direct link to the meeting I can't seem to join
> 
> On Fri, Jul 27, 2018 at 12:08 PM Curtis Howard 
> wrote:
> 
>> Hi Mark - I'll be on as well.
>> 
>> Thanks
>> Curtis
>> 
>> On Fri, Jul 27, 2018 at 2:54 PM Austin Leahy 
>> wrote:
>> 
>>> I am also able to attend
>>> 
>>> 
>>> On Fri, Jul 27, 2018 at 11:27 AM Nate Smith wrote:
>>> 
>>>> Thanks Mark,
>>>> I will be there.
>>>> 
>>>> - nathanael
>>>> 
>>>>> On Jul 26, 2018, at 8:40 PM, Mark Schoeni wrote:
>>>>> 
>>>>> Hello Everyone,
>>>>> This is a reminder about tomorrow's (July 27) Spot Project meeting from
>>>>> *3:30* to *4:30* (*ET*).
>>>>> 
>>>>> We will be using Google Hangouts.
>>>>> Spot Meeting Link
>>>>> <https://calendar.google.com/event?action=TEMPLATE=NWNudjdoODg0NjRobTdpcmFtZHBuY2I1MG0gNG9scWwxamdyOGFrbzdyaG9iZzMwY25lOThAZw=4olql1jgr8ako7rhobg30cne98%40group.calendar.google.com>
>>>>> 
>>>>> The agenda is available below.
>>>>> 
>>>>> -- Forwarded message -
>>>>> From: Mark Schoeni 
>>>>> Date: Tue, Jul 17, 2018 at 3:47 PM
>>>>> Subject: Spot Project Meeting - Date & Time is Set
>>>>> To: 
>>>>> 
>>>>> 
>>>>> Hello Everyone,
>>>>> I've compiled the results of the survey concerning the Spot Project Meeting.
>>>>> The best time for attendance (according to the survey) will be on:
>>>>> 
>>>>> *Friday* (July *27*) from *3:30*pm - *4:30*pm (*ET*)
>>>>> 
>>>>> We will use Google Hangouts
>>>>> Spot Meeting Link
>>>>> <https://calendar.google.com/event?action=TEMPLATE=NWNudjdoODg0NjRobTdpcmFtZHBuY2I1MG0gNG9scWwxamdyOGFrbzdyaG9iZzMwY25lOThAZw=4olql1jgr8ako7rhobg30cne98%40group.calendar.google.com>
>>>>> 
>>>>> Look forward to seeing everyone there.
>>>>> If anyone misses it, we will try to take meeting notes and distribute them
>>>>> after through the dev. list.
>>>>> 
>>>>> *Agenda*
>>>>> 
>>>>>  - Current status and feedback from Spot users (what's working and what
>>>>>  isn't)
>>>>>  - Next release (what should it contain and who can do what)
>>>>>  - How can we better build a community?
>>>>>  - Roadmap discussion
>>>>>  - Open Forum
>>>>> 
>>>>> 
>>>>> Thank you,
>>>>> 
>>>>> - Mark Schoeni
>>>> 
>>> 
>> 



Re: Proposed Addition to SPOT

2018-06-26 Thread Nate Smith
+1

What would be the best medium to conduct the meetings?
A recording should be made and posted, and it should be open to the whole 
mailing list.

- nathanael 

> On Jun 26, 2018, at 8:46 AM, Curtis Howard  
> wrote:
> 
> +1
> 
> On Mon, Jun 25, 2018 at 12:27 PM Morris Hicks 
> wrote:
> 
>> +1
>> 
>> On Mon, Jun 25, 2018 at 11:59 AM, Tadd Wood 
>> wrote:
>> 
>>> Mark,
>>> 
>>> I would also be open to regular project meetings/discussions.
>>> 
>>> t...@arcadiadata.com
>>> 
>>> Thank you,
>>> Tadd Wood
>>> 
>>>> On Jun 25, 2018, at 8:47 AM, Chad Perkins  wrote:
>>>> 
>>>> I would definitely be interested in a monthly meeting.
>>>> 
>>>> Chad Perkins
>>>> 214-296-2045
>>>> 
>>>> On Jun 25, 2018, at 10:15 AM, Mark Schoeni  wrote:
>>>> 
>>>> Thank you all for the support.
>>>> Cesar, Thank you for sending me the link. I have signed the paperwork
>>>> (attached), but I can't seem to find the proper place to submit it.
>>>> 
>>>> Also, I just started working at Cloudera where I got connected to
>>>> Morris. After being introduced to the project, my understanding is that
>>>> there aren't any team meet-ups. It could be beneficial if we met once a
>>>> month to talk about the state of the project. I searched the Legal Jira
>>>> page and have not seen any comments or instructions on how to run project
>>>> meetings.
>>>> 
>>>> Would anyone be interested in meeting once a month to discuss the state
>>>> of the project?
>>>> 
>>>> Thanks again.
>>>> 
>>>> On Wed, Jun 20, 2018 at 11:34 PM Cesar Berho <ce...@apache.org> wrote:
>>>> Hi Mark,
>>>> 
>>>> The proposal looks great and definitely is worth having an extended
>>>> discussion about it.
>>>> 
>>>> Also I would ask please if you can complete and sign the ASF ICLA so you
>>>> receive the proper acknowledgment for your contributions to the project:
>>>> 
>>>> https://www.apache.org/licenses/icla.pdf
>>>> 
>>>> Thanks,
>>>> Cesar
>>>> 
>>>>> On Wed, Jun 20, 2018 at 1:58 PM Mark Schoeni  wrote:
>>>>> 
>>>>> Hello Everyone,
>>>>> My name is Mark Schoeni and I am new to the SPOT initiative.
>>>>> I would like to add some features to spot and I have some slides explaining
>>>>> the additions (SPOT-EntityModeling
>>>>> <https://docs.google.com/presentation/d/1pP1w8sEenvOyWyBwv8gtOpfqybXn-eFFurGgs2EyZCQ/edit?usp=sharing>).
>>>>> Any feedback would be greatly appreciated. I have a short summary of the
>>>>> proposal below.
>>>>> 
>>>>> *Summary* (*TLDR*)
>>>>> I will be adding entity profiling capabilities to SPOT. This will allow us
>>>>> to generate "Features" from the data in the ODM table. A Feature is an
>>>>> aggregate of data taken at user defined iterations. For example, Tracking
>>>>> the login attempt count of each user per day. This would allow an
>>>>> administrator to alert on a sudden spike in login attempts. I am proposing
>>>>> creating an interface to allow users to easily add Features and be able to
>>>>> trend and alert on them.
>>>>> 
>>>>> Thank you,
>>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Morris Hicks
>> Strategic Enablement Expert, Cloudera
>> mhi...@cloudera.com
>> (703) 447-5883
>> 
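
To make the "Feature" idea from Mark's proposal quoted above concrete, here is a rough sketch of the login-attempt example: count attempts per user per day, then flag days that spike against that user's other days. Everything here is an assumption for illustration. The event layout, the function names, and the spike rule (a fixed multiple of the mean of the other days) are hypothetical and are not taken from the proposal or from the ODM.

```python
from collections import Counter
from datetime import datetime


def daily_login_counts(events):
    """Count login attempts per (user, day).

    `events` is an iterable of (user, iso_timestamp) pairs; this layout is
    a hypothetical stand-in for rows read from an ODM table.
    """
    counts = Counter()
    for user, timestamp in events:
        day = datetime.fromisoformat(timestamp).date()
        counts[(user, day)] += 1
    return counts


def spikes(counts, factor=3.0, min_count=5):
    """Return (user, day, count) where count >= factor * mean of the user's other days.

    Days with fewer than `min_count` attempts are never flagged, and a user
    with only one day of data has no baseline and is skipped.
    """
    by_user = {}
    for (user, day), n in counts.items():
        by_user.setdefault(user, {})[day] = n

    flagged = []
    for user, days in by_user.items():
        for day, n in sorted(days.items()):
            others = [v for d, v in days.items() if d != day]
            if not others or n < min_count:
                continue
            baseline = sum(others) / len(others)
            if n >= factor * baseline:
                flagged.append((user, day, n))
    return flagged
```

A real implementation would presumably run the aggregation in Spark or SQL over the ODM tables; the plain-Python version only shows the shape of the computation.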


Re: Podling Report Reminder - June 2018

2018-06-19 Thread Nate Smith
Hello Justin,

First congrats on the new role and thank you for the help.

There does tend to be chatter on the Spot Slack channel. I have tried on
multiple occasions to redirect individuals to the dev/user lists for
questions, with little success.
We have discussed putting a bot on the channel to say "no one will answer
questions that should be directed to the mail list" etc.
I will look at doing this soon.

There are things happening, and I'm aware of development going on at several
organizations; however, not a lot of it is visible at this time.

I would like to address the issue of mentors. I know that we have had this
discussion on the Incubator mail list, but I think I'd like to ask for help
in this area.
Our mentors have been extremely helpful in the past. However, not all of our
mentors are currently active, and it's likely that we need to seek some more
help in this area.
Any suggestions?

- Nathanael

On Tue, Jun 19, 2018 at 6:21 PM Justin Mclean 
wrote:

> Hi,
>
> I noticed the podling has failed to report this month and so will need to
> report next month. This is not the first missed report. There doesn’t seem
> to be a lot of activity on the dev list. Is it just that there’s not much to
> report, or is it that conversation about the project is happening elsewhere
> and needs to be brought back to the dev list? Can someone on the
> PMC or one of the mentors reply here with what they think is happening with
> this project and what might be done to increase the level of
> activity?
>
> Thanks,
> Justin


[PR #141] Ingestion using Spark Streaming

2018-05-07 Thread Nate Smith
I’m very interested in this, but we need more eyes.
Please review and don’t let this die on the vine

https://github.com/apache/incubator-spot/pull/141

- Nathanael

[PR #143] clean config options via configurator.py

2018-05-07 Thread Nate Smith
Hello,

Please review:
https://github.com/apache/incubator-spot/pull/143

- Nathanael


Re: Configuration-driven ingest for the Open Data Model (ODM) using Spark Streaming (Envelope)

2018-05-01 Thread Nate Smith
Curtis, 

Have you tested this with a standard version of nfdump? Or only spot-nfdump?

- Nathanael

> On May 1, 2018, at 1:12 PM, Curtis Howard  wrote:
> 
> Hi all,
> 
> We had discussed prototyping Envelope for ingest in the past - I've
> submitted a PR for this which includes:
>  - Kafka -> Spark streaming -> ODM Hive table applications for dns, flow
> and proxy raw source data
>  - a simple alternative for source data collection/dissection using
> tshark/nfdump/unzip + Flume (sinking data to Kafka)
>  - https://github.com/apache/incubator-spot/pull/144
> 
> To quote directly from the Envelope site
> (https://github.com/cloudera-labs/envelope#envelope):
> *"Envelope is simply a pre-made Spark application that implements many of
> the tasks commonly found in ETL pipelines. In many cases, Envelope allows
> large pipelines to be developed on Spark with no coding required. When
> custom code is needed, there are pluggable points in Envelope for core
> functionality to be extended. Envelope works in batch and streaming modes."*
> 
> For example, the complete Kafka/SparkStreaming/ODM ingest application
> definition for DNS:
> https://github.com/curtishoward/incubator-spot/blob/SPOT-181_envelope_ingest/spot-ingest/odm/workers/spot_proxy.conf
> 
> From the perspective of the Spot project, my thoughts are that it would
> enable:
>  - faster turnaround time to ingest new source types while still allowing
> for arbitrarily complex ETL pipelines (data enrichment, data quality
> checks, etc..)
>  - simplify future integration with other storage layers (HBase, Kudu, for
> example)
>  - a framework that is simple to extend (input sources, output storage
> layers, translators, derivers, UDFs, ...)
> 
> If there is interest, I will continue to refactor the current
> implementation - centralize/integrate configuration with spot.conf, test
> Kerberos integration, run performance tests and tune as possible.
> 
> In the near term, I will also add a PR with Hive views for dns/flow/proxy
> under spot-ml/ - this should enable an end-to-end proof-of-concept ODM
> implementation using Envelope.
> 
> Thanks
> Curtis



Re: Configuration-driven ingest for the Open Data Model (ODM) using Spark Streaming (Envelope)

2018-05-01 Thread Nate Smith
Thank you for all the hard work Curtis,
I will start reviewing.

- Nathanael

> On May 1, 2018, at 1:12 PM, Curtis Howard  wrote:
> 
> Hi all,
> 
> We had discussed prototyping Envelope for ingest in the past - I've
> submitted a PR for this which includes:
>  - Kafka -> Spark streaming -> ODM Hive table applications for dns, flow
> and proxy raw source data
>  - a simple alternative for source data collection/dissection using
> tshark/nfdump/unzip + Flume (sinking data to Kafka)
>  - https://github.com/apache/incubator-spot/pull/144
> 
> To quote directly from the Envelope site
> (https://github.com/cloudera-labs/envelope#envelope):
> *"Envelope is simply a pre-made Spark application that implements many of
> the tasks commonly found in ETL pipelines. In many cases, Envelope allows
> large pipelines to be developed on Spark with no coding required. When
> custom code is needed, there are pluggable points in Envelope for core
> functionality to be extended. Envelope works in batch and streaming modes."*
> 
> For example, the complete Kafka/SparkStreaming/ODM ingest application
> definition for DNS:
> https://github.com/curtishoward/incubator-spot/blob/SPOT-181_envelope_ingest/spot-ingest/odm/workers/spot_proxy.conf
> 
> From the perspective of the Spot project, my thoughts are that it would
> enable:
>  - faster turnaround time to ingest new source types while still allowing
> for arbitrarily complex ETL pipelines (data enrichment, data quality
> checks, etc..)
>  - simplify future integration with other storage layers (HBase, Kudu, for
> example)
>  - a framework that is simple to extend (input sources, output storage
> layers, translators, derivers, UDFs, ...)
> 
> If there is interest, I will continue to refactor the current
> implementation - centralize/integrate configuration with spot.conf, test
> Kerberos integration, run performance tests and tune as possible.
> 
> In the near term, I will also add a PR with Hive views for dns/flow/proxy
> under spot-ml/ - this should enable an end-to-end proof-of-concept ODM
> implementation using Envelope.
> 
> Thanks
> Curtis



[SPOT-ML] LDA options in spot.conf

2018-04-30 Thread Nate Smith
I’m adding some checks into ml_ops.sh to avoid passing spark-submit a bunch of 
empty variables.

My question is whether the LDA_* options in spot.conf should really be 
SPK_LDA_*. 
They are variables for the Spark job, yet it’s not instantly clear that they 
need to be included and cannot be left blank when setting Spot up.

- Nathanael
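
The kind of check described above could look roughly like this. It is a sketch in Python rather than the shell used by ml_ops.sh, and the function name and the setting names in the usage note are hypothetical; only the spark-submit flags mentioned there (`--executor-memory`, `--num-executors`, `--driver-memory`) are real options.

```python
def build_submit_args(settings, flag_for):
    """Build spark-submit style arguments, skipping unset or blank settings.

    `settings` maps configuration names to string values (or None).
    `flag_for` maps configuration names to the command-line flag each feeds.
    Returns (args, missing): the argument list, and the names left blank so
    the caller can warn about them or abort.
    """
    args = []
    missing = []
    for name, flag in flag_for.items():
        value = (settings.get(name) or "").strip()
        if value:
            args.extend([flag, value])
        else:
            missing.append(name)
    return args, missing
```

For example, with hypothetical settings `{"SPK_EXEC_MEM": "4g", "SPK_EXEC": "  "}` and a flag map covering `SPK_EXEC_MEM`, `SPK_EXEC` and `SPK_DRIVER_MEM`, only `--executor-memory 4g` is emitted and the other two names are reported as missing, instead of handing spark-submit a flag with an empty value.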

[spot.conf] name_node in spot.conf is confusing

2018-04-13 Thread Nate Smith
Should ’name_node’ in spot.conf be changed to something like web_hdfs?
In the latest version it should be the WebHDFS address for use by the Python APIs.
This makes it a bit confusing for new users.

- Nathanael

Test Message [disregard]

2018-04-10 Thread Nate Smith
Testing the mail servers, disregard


Re: [Question] disabling reputation services in OA

2018-03-22 Thread Nate Smith
The answer can be found here under Enable/Disable GTI service:
https://github.com/apache/incubator-spot/tree/master/spot-oa/oa/components

> On Mar 22, 2018, at 12:32 PM, Nate Smith <natedogs...@gmail.com> wrote:
> 
> Hello,
> 
> Is there a simple way to disable reputation services as a whole using only 
> configuration?
> I recall there should be a way to disable reputation services easily.
> 
> - Nathanael



[Question] disabling reputation services in OA

2018-03-22 Thread Nate Smith
Hello,

Is there a simple way to disable reputation services as a whole using only 
configuration?
I recall there should be a way to disable reputation services easily.

- Nathanael

Spot release tar file not opening

2018-03-15 Thread Nate Smith
Hello,

Has anyone had issues with the 1.0 release tar?

http://spot.apache.org/download/ 

I’ve seen one individual download the file multiple times and not be able to 
unzip it.

I’ve just tried this from one of the mirrors and not had an issue, but I want 
to make sure I’m not missing something.

- Nathanael

Re: DNS ingestion

2018-02-27 Thread Nate Smith
Hello,

This email seems to have slipped through the cracks. 
How is development going?
Is this still something you are interested in contributing?

- nathanael 

> On Sep 15, 2017, at 5:17 AM, Salvatore Elio  wrote:
> 
> Hello,
> 
> for an internal project we have developed a different ingestion process for 
> DNS in order to have real-time ingestion and support early enrichment of 
> the ingested data.
> 
> The ingestion process is split into 2 processes:
> 
> 1) From DNS data to Kafka - an Akka Streams job based on Pcap4j 
> (https://github.com/kaitoy/pcap4j) which:
> 
> a. loops through all the filtered UDP packets on port 53 using Pcap4j;
> 
> b. converts Pcap4j packet objects to Avro using Twitter Bijection;
> 
> c. sends Avro objects to Kafka.
> 
> 2) From Kafka to Hive - a Spark Streaming job that reads new messages from 
> Kafka and writes them to a partitioned HDFS Parquet folder readable by 
> Hive/Impala.
> 
> We would like to know your thoughts about this and whether this could be 
> integrated into Apache Spot. If it is of interest we can share the code.
> 
> Thanks
> 
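
Two pieces of the pipeline Salvatore describes are easy to sketch without the actual code, which was not shared on the list: the port-53 UDP filter from step 1 and the choice of a partition folder in step 2. The packet dictionary layout, the function names, and the `y=/m=/d=/h=` folder scheme below are all assumptions for illustration; the original job uses Pcap4j, Akka Streams and Spark Streaming rather than plain Python.

```python
from datetime import datetime, timezone


def is_dns(packet):
    """True for UDP packets with port 53 on either side.

    `packet` is a hypothetical dict with "protocol", "src_port" and
    "dst_port" keys, standing in for a parsed capture record.
    """
    ports = (packet.get("src_port"), packet.get("dst_port"))
    return packet.get("protocol") == "udp" and 53 in ports


def partition_path(base, epoch_seconds):
    """Return an hourly partition folder (UTC) for a record timestamp.

    The y=/m=/d=/h= layout is an assumed naming scheme, not one confirmed
    by the thread.
    """
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return "%s/y=%04d/m=%02d/d=%02d/h=%02d" % (
        base, ts.year, ts.month, ts.day, ts.hour
    )
```

Partitioning by time keeps each streaming micro-batch writing into a small number of folders and lets Hive or Impala prune partitions on date filters.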


Re: [apache/incubator-spot] One of your dependencies may have a security vulnerability

2018-02-07 Thread Nate Smith
SPOT-262 has been opened:
https://issues.apache.org/jira/browse/SPOT-262

Assuming this is true:
>>> update suggested: jquery ~> 3.0.0.

What version should we be using besides latest?

- Nathanael
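
On the question of which versions the suggested constraint allows: assuming the alert's `~>` is the RubyGems-style "pessimistic" operator (the thread does not say), `~> 3.0.0` means at least 3.0.0 and below 3.1.0. A small sketch of that rule, with hypothetical helper names and support for plain dotted numeric versions only (no pre-release tags such as `-rc1`):

```python
def parse(version, width=0):
    """Parse "3.0.5" into (3, 0, 5), zero-padded on the right to `width` parts."""
    parts = [int(piece) for piece in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies_pessimistic(version, constraint):
    """True if `version` satisfies a RubyGems-style `~> constraint`.

    The last component of the constraint may grow freely; the component
    before it is pinned, so "~> 3.0.0" allows >= 3.0.0 and < 3.1.0.
    """
    floor = parse(constraint)
    if len(floor) < 2:
        raise ValueError("constraint needs at least two components")
    # Bump the second-to-last component and zero the last one.
    ceiling = floor[:-2] + (floor[-2] + 1, 0)
    candidate = parse(version, width=len(floor))
    return floor <= candidate < ceiling
```

Under this reading any 3.0.x release satisfies the constraint, while 2.2.4 and 3.1.0 do not.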

> On Feb 7, 2018, at 1:44 PM, Nate Smith <nathan...@apache.org> wrote:
> 
> Thank you for the notice,
> I’m opening a Jira right now and will work at getting this addressed.
> 
> Is there a way I can make sure that we get these notifications in the future?
> This is the first email I’ve seen regarding this and I did not get a notice 
> from GitHub of course.
> 
> - Nathanael
> 
>> On Feb 7, 2018, at 12:32 PM, David Fisher <w...@apache.org> wrote:
>> 
>> Spot PPMC - You need to be responsive to security issues.
>> 
>> Regards,
>> Dave - your friendly Incubator Shepherd
>> 
>> On 2018/01/22 15:18:06, Greg Stein <gst...@gmail.com> wrote: 
>>> Spot PPMC: FYI
>>> 
>>> -- Forwarded message --
>>> From: GitHub <notificati...@github.com>
>>> Date: Mon, Jan 22, 2018 at 9:03 AM
>>> Subject: [apache/incubator-spot] One of your dependencies may have a
>>> security vulnerability
>>> To: apache/incubator-spot <incubator-s...@noreply.github.com>
>>> Cc: Security alert <security_al...@noreply.github.com>
>>> 
>>> 
>>> We found a potential security vulnerability in one of your dependencies
>>> 
>>> *gstein,*
>>> 
>>> We found a potential security vulnerability in a repository which you have
>>> been granted security alert access.
>>> apache/incubator-spot
>>> Known *moderate severity* security vulnerability detected in jquery < 3.0.0
>>> defined in package.json

Re: [PODLING REPORT] February 2018

2018-02-07 Thread Nate Smith
Thanks for helping Brock,

You just need to send an email to gene...@incubator.apache.org asking for write 
access.
Please include your Apache user ID so they can add you.
Someone will typically grant you access within a few hours.

- Nathanael

> On Feb 7, 2018, at 2:27 PM, Brock Noland <br...@apache.org> wrote:
> 
> Hi,
> 
> I sign off. For some reason I cannot edit the page (username brocknoland) can 
> someone check my name?
> 
> Brock
> 
> On Wed, Feb 7, 2018 at 11:13 AM, Nate Smith <natedogs...@gmail.com> wrote:
> The report has been uploaded,
> Mentors, Can you review please?
> 
> https://wiki.apache.org/incubator/February2018
> 
> - Nathanael
> 
> > On Feb 2, 2018, at 1:35 PM, Nate Smith <natedogs...@gmail.com> wrote:
> >
> > Well, the work is done and tested. It’s just sitting in a PR for review and 
> > merge.
> > If we can merge it before then I think it’s a good highlight, but I might be 
> > a bit biased :)
> > There are several advantages with that PR that can be mentioned: besides 
> > Kerberos support, we’re also using programmatic interfaces instead of Hadoop 
> > command line tools for most interactions.
> >
> > - Nathanael
> >
> >> On Feb 2, 2018, at 7:16 AM, Sam Heywood <sam.heyw...@cloudera.com> wrote:
> >>
> >> Thanks for pulling this together. Is it premature to mention the kerberos
> >> work?
> >>
> >> --Sam
> >>
> >> On Wed, Jan 31, 2018 at 1:59 AM, Nate Smith <nathan...@apache.org> wrote:
> >>
> >>> Hello,
> >>>
> >>> Below I've included a draft of the podling report which is due Feb 7th.
> >>> Please feel free to request any additions/changes to the report.
> >>> I will post the latest version on the incubator wiki on Feb 6th.
> >>>
> >>> - Nathanael
> >>>
> >>> Draft of report below:
> >>> __
> >>>
> >>> Spot
> >>>
> >>>
> >>> Apache Spot is a platform for network telemetry built on an open data model
> >>> and Apache Hadoop.
> >>>
> >>>
> >>> Spot has been incubating since 2016-09-23.
> >>>
> >>>
> >>> Three most important issues to address in the move towards graduation:
> >>>
> >>>
> >>> 1. Develop a better release process
> >>>
> >>> 2. Handle additional data types for ingestion and enrichment into ODM
> >>> schema
> >>>
> >>> 3. Fostering more activity in the user, dev and private mail lists
> >>>
> >>>
> >>> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> >>>
> >>> aware of?
> >>>
> >>> a. Issues providing podling reports during December, January.
> >>> This is being addressed moving forward.
> >>>
> >>>
> >>> How has the community developed since the last report?
> >>>
> >>> a. Seeing more pull requests from new contributors
> >>>
> >>> How has the project developed since the last report?
> >>>
> >>> a. Development on the ODM branch has been moving forward and will continue
> >>> to push towards adoption into the master branch.
> >>> b. Ingest redesign underway
> >>>
> >>> How would you assess the podling's maturity?
> >>>
> >>> Please feel free to add your own commentary.
> >>>
> >>>
> >>> [ ] Initial setup
> >>>
> >>> [ ] Working towards first release
> >>>
> >>> [x ] Community building
> >>>
> >>> [ ] Nearing graduation
> >>>
> >>> [ ] Other:
> >>>
> >>>
> >>> Date of last release:
> >>>
> >>>
> >>> 2017-09-08
> >>>
> >>>
> >>> When were the last committers or PPMC members elected?
> >>>
> >>> 2018-01-18
> >>> __
> >>>
> >>
> >>
> >>
> >> --
> >> Sam Heywood
> >> Director Cybersecurity Strategy, Cloudera
> >> sam.heyw...@cloudera.com <mailto:sam.heyw...@cloudera.com> 
> >> <sam.heyw...@gazzang.com <mailto:sam.heyw...@gazzang.com>>
> >> M: (512) 716-9660 <tel:%28512%29%20716-9660>
> >
> 
> 



Re: [PODLING REPORT] February 2018

2018-02-07 Thread Nate Smith
The report has been uploaded,
Mentors, Can you review please?

https://wiki.apache.org/incubator/February2018

- Nathanael

> On Feb 2, 2018, at 1:35 PM, Nate Smith <natedogs...@gmail.com> wrote:
> 
> Well the work is done, and tested. It’s just sitting in a PR for review and 
> merge.
> If we can merge it before then I think it’s a good highlight but I might be a 
> bit biased :)
> There’s several advantages with that PR that can be mentioned, besides 
> kerberos support we’re also using programmatic interfaces instead of Hadoop 
> command line tools for most interactions.
> 
> - Nathanael
> 
>> On Feb 2, 2018, at 7:16 AM, Sam Heywood <sam.heyw...@cloudera.com> wrote:
>> 
>> Thanks for pulling this together. Is it premature to mention the kerberos
>> work?
>> 
>> --Sam
>> 
>> On Wed, Jan 31, 2018 at 1:59 AM, Nate Smith <nathan...@apache.org> wrote:
>> 
>>> Hello,
>>> 
>>> Below I've included a draft of the podling report which is due Feb 7th.
>>> Please feel free to request any additions/changes to the report.
>>> I will post the latest version on the incubator wiki on Feb 6th.
>>> 
>>> - Nathanael
>>> 
>>> Draft of report below:
>>> __
>>> 
>>> Spot
>>> 
>>> 
>>> Apache Spot is a platform for network telemetry built on an open data model
>>> 
>>> and Apache Hadoop.
>>> 
>>> 
>>> Spot has been incubating since 2016-09-23.
>>> 
>>> 
>>> Three most important issues to address in the move towards graduation:
>>> 
>>> 
>>> 1. Develop a better release process
>>> 
>>> 2. Handle additional data types for ingestion and enrichment into ODM
>>> schema
>>> 
>>> 3. Fostering more activity in the user, dev and private mail lists
>>> 
>>> 
>>> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
>>> 
>>> aware of?
>>> 
>>> a. Issues providing podling reports during December, January.
>>> This is being addressed moving forward.
>>> 
>>> 
>>> How has the community developed since the last report?
>>> 
>>> a. Seeing more pull requests from new contributors
>>> 
>>> How has the project developed since the last report?
>>> 
>>> a. Development on the ODM branch has been moving forward and will continue
>>> to push towards adoption into the master branch.
>>> b. Ingest redesign underway
>>> 
>>> How would you assess the podling's maturity?
>>> 
>>> Please feel free to add your own commentary.
>>> 
>>> 
>>> [ ] Initial setup
>>> 
>>> [ ] Working towards first release
>>> 
>>> [x ] Community building
>>> 
>>> [ ] Nearing graduation
>>> 
>>> [ ] Other:
>>> 
>>> 
>>> Date of last release:
>>> 
>>> 
>>> 2017-09-08
>>> 
>>> 
>>> When were the last committers or PPMC members elected?
>>> 
>>> 2018-01-18
>>> __
>>> 
>> 
>> 
>> 
>> -- 
>> Sam Heywood
>> Director Cybersecurity Strategy, Cloudera
>> sam.heyw...@cloudera.com <sam.heyw...@gazzang.com>
>> M: (512) 716-9660
> 



Re: [PODLING REPORT] February 2018

2018-02-02 Thread Nate Smith
Well, the work is done and tested. It’s just sitting in a PR for review and 
merge.
If we can merge it before then, I think it’s a good highlight, but I might be a 
bit biased :)
There are several advantages to that PR worth mentioning: besides Kerberos 
support, we’re also using programmatic interfaces instead of Hadoop command-line 
tools for most interactions.

- Nathanael

> On Feb 2, 2018, at 7:16 AM, Sam Heywood <sam.heyw...@cloudera.com> wrote:
> 
> Thanks for pulling this together. Is it premature to mention the kerberos
> work?
> 
> --Sam
> 
> On Wed, Jan 31, 2018 at 1:59 AM, Nate Smith <nathan...@apache.org> wrote:
> 
>> Hello,
>> 
>> Below I've included a draft of the podling report which is due Feb 7th.
>> Please feel free to request any additions/changes to the report.
>> I will post the latest version on the incubator wiki on Feb 6th.
>> 
>> - Nathanael
>> 
>> Draft of report below:
>> __
>> 
>> Spot
>> 
>> 
>> Apache Spot is a platform for network telemetry built on an open data model
>> 
>> and Apache Hadoop.
>> 
>> 
>> Spot has been incubating since 2016-09-23.
>> 
>> 
>> Three most important issues to address in the move towards graduation:
>> 
>> 
>>  1. Develop a better release process
>> 
>>  2. Handle additional data types for ingestion and enrichment into ODM
>> schema
>> 
>>  3. Fostering more activity in the user, dev and private mail lists
>> 
>> 
>> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
>> 
>> aware of?
>> 
>> a. Issues providing podling reports during December, January.
>> This is being addressed moving forward.
>> 
>> 
>> How has the community developed since the last report?
>> 
>> a. Seeing more pull requests from new contributors
>> 
>> How has the project developed since the last report?
>> 
>> a. Development on the ODM branch has been moving forward and will continue
>> to push towards adoption into the master branch.
>> b. Ingest redesign underway
>> 
>> How would you assess the podling's maturity?
>> 
>> Please feel free to add your own commentary.
>> 
>> 
>>  [ ] Initial setup
>> 
>>  [ ] Working towards first release
>> 
>>  [x ] Community building
>> 
>>  [ ] Nearing graduation
>> 
>>  [ ] Other:
>> 
>> 
>> Date of last release:
>> 
>> 
>> 2017-09-08
>> 
>> 
>> When were the last committers or PPMC members elected?
>> 
>> 2018-01-18
>> __
>> 
> 
> 
> 
> -- 
> Sam Heywood
> Director Cybersecurity Strategy, Cloudera
> sam.heyw...@cloudera.com <sam.heyw...@gazzang.com>
> M: (512) 716-9660



[PODLING REPORT] February 2018

2018-01-31 Thread Nate Smith
Hello,

Below I've included a draft of the podling report which is due Feb 7th.
Please feel free to request any additions/changes to the report.
I will post the latest version on the incubator wiki on Feb 6th.

- Nathanael

Draft of report below:
__

Spot


Apache Spot is a platform for network telemetry built on an open data model
and Apache Hadoop.


Spot has been incubating since 2016-09-23.


Three most important issues to address in the move towards graduation:


  1. Develop a better release process

  2. Handle additional data types for ingestion and enrichment into ODM
schema

  3. Fostering more activity in the user, dev and private mail lists


Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

a. Issues providing podling reports during December, January.
This is being addressed moving forward.


How has the community developed since the last report?

a. Seeing more pull requests from new contributors

How has the project developed since the last report?

a. Development on the ODM branch has been moving forward and will continue
to push towards adoption into the master branch.
b. Ingest redesign underway

How would you assess the podling's maturity?

Please feel free to add your own commentary.


  [ ] Initial setup

  [ ] Working towards first release

  [x ] Community building

  [ ] Nearing graduation

  [ ] Other:


Date of last release:


 2017-09-08


When were the last committers or PPMC members elected?

2018-01-18
__


Re: Podling Report Reminder - December 2017

2017-12-14 Thread Nate Smith
Hi John, 

It seems the sign-off was done, by mistake, on our draft of the document 
itself rather than on the wiki.

I’ve sent a note; hopefully it will be resolved soon. 
Thank you for your patience,

- nathanael 

> On Dec 14, 2017, at 6:52 PM, John D. Ament <johndam...@apache.org> wrote:
> 
> Your report is missing mentor sign off.  Without sign off, it will be 
> rejected and you will be asked to report again next month.  Please mentors 
> review their report and sign off as appropriate.
> 
> John
> 
>> On 2017-12-07 18:17, Nate Smith <nathan...@apache.org> wrote: 
>> I haven’t seen any discussion of it.
>> 
>> I’m not sure of any major updates at this time.
>> 
>> I don’t think anyone else has write access to the wiki but Cesar and
>> Myself, I can upload the report when ready
>> 
>> FYI, we have to upload a doc, then a mentor signs off via the wiki page.
>> 
>> - Nathanael
>> 
>> On Dec 7, 2017, at 10:08 AM, Michael Ridley <mrid...@cloudera.com> wrote:
>> 
>> Hi Sam,
>> 
>> I haven't seen it on dev@ but it doesn't really need to be.  Just mentor
>> sign off and posted to the wiki.  There may have been discussion on private@
>> but I'm not sure since I'm not on that list since I'm not on the PMC.
>> 
>> Michael
>> 
>> On Thu, Dec 7, 2017 at 7:52 AM, Sam Heywood <sam.heyw...@cloudera.com>
>> wrote:
>> 
>> Michael
>> 
>> I don't think the doc has been shared with dev@ yet.
>> 
>> --Sam
>> 
>> On Tue, Dec 5, 2017 at 6:28 AM, Michael Ridley <mrid...@cloudera.com>
>> wrote:
>> 
>> I know Vartika and Nate were working on this last week.  Did this get
>> mentor review and submitted?
>> 
>> Michael
>> 
>> On Tue, Dec 5, 2017 at 6:38 AM, <johndam...@apache.org> wrote:
>> 
>> Dear podling,
>> 
>> This email was sent by an automated system on behalf of the Apache
>> Incubator PMC. It is an initial reminder to give you plenty of time to
>> prepare your quarterly board report.
>> 
>> The board meeting is scheduled for Wed, 20 December 2017, 10:30 am PDT.
>> The report for your podling will form a part of the Incubator PMC
>> report. The Incubator PMC requires your report to be submitted 2 weeks
>> before the board meeting, to allow sufficient time for review and
>> submission (Wed, December 06).
>> 
>> Please submit your report with sufficient time to allow the Incubator
>> PMC, and subsequently board members to review and digest. Again, the
>> very latest you should submit your report is 2 weeks prior to the board
>> meeting.
>> 
>> Thanks,
>> 
>> The Apache Incubator PMC
>> 
>> Submitting your Report
>> 
>> --
>> 
>> Your report should contain the following:
>> 
>> *   Your project name
>> *   A brief description of your project, which assumes no knowledge of
>>   the project or necessarily of its field
>> *   A list of the three most important issues to address in the move
>>   towards graduation.
>> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
>>   aware of
>> *   How has the community developed since the last report
>> *   How has the project developed since the last report.
>> *   How does the podling rate their own maturity.
>> 
>> This should be appended to the Incubator Wiki page at:
>> 
>> https://wiki.apache.org/incubator/December2017
>> 
>> Note: This is manually populated. You may need to wait a little before
>> this page is created from a template.
>> 
>> Mentors
>> ---
>> 
>> Mentors should review reports for their project(s) and sign them off on
>> the Incubator wiki page. Signing off reports shows that you are
>> following the project - projects that are not signed may raise alarms
>> for the Incubator PMC.
>> 
>> Incubator PMC
>> 
>> 
>> 
>> 
>> --
>> Michael Ridley <mrid...@cloudera.com>
>> office: (650) 352-1337
>> mobile: (571) 438-2420
>> Senior Solutions Architect
>> Cloudera
>> 
>> 
>> 
>> 
>> --
>> Sam Heywood
>> Director Cybersecurity Strategy, Cloudera
>> sam.heyw...@cloudera.com <sam.heyw...@gazzang.com>
>> M: (512) 716-9660
>> 
>> 
>> 
>> 
>> -- 
>> Michael Ridley <mrid...@cloudera.com>
>> office: (650) 352-1337
>> mobile: (571) 438-2420
>> Senior Solutions Architect
>> Cloudera
>> 


Re: [VOTE] New development branch for ingestion component

2017-11-28 Thread Nate Smith
Thanks Michael,

The vote has passed with:
1 non-binding
8 binding

We can certainly call it closed, as enough time has elapsed.

I can create the epic if needed.
Once we have the epic, I'll create the branch.

- Nathanael

On Nov 28, 2017, at 12:16 PM, Michael Ridley <mrid...@cloudera.com> wrote:

Vartika,

Since you coordinated the design doc development, can you create the JIRA
Epic and related issues?

Thanks!

Michael

On Tue, Nov 28, 2017 at 3:15 PM, Michael Ridley <mrid...@cloudera.com>
wrote:

Hi team-

Wanted to follow-up on this, it looks like the vote passed.  Can we get
the branch created on the ASF git repo?

Thanks!

Michael

On Mon, Nov 27, 2017 at 2:51 PM, Michael Ridley <mrid...@cloudera.com>
wrote:

Hope all in the US had a great Thanksgiving holiday!  Wanted to follow-up
on this as it looks like the vote passed?

Michael

On Wed, Nov 22, 2017 at 12:53 PM, Lujan Moreno, Gustavo <
gustavo.lujan.mor...@intel.com> wrote:

+1

On 11/21/17, 5:53 PM, "Mubashir Kazia" <mka...@cloudera.com> wrote:

   +1

   On Nov 22, 2017 4:57 AM, "Nate Smith" <natedogs...@gmail.com> wrote:

   +1

On Nov 21, 2017, at 12:41 PM, Vartika Singh <vsi...@cloudera.com> wrote:

+1

On Tue, Nov 21, 2017 at 12:21 PM, Cesar Berho <ce...@apache.org> wrote:

+1

On Tue, Nov 21, 2017 at 12:26 PM, solrac...@apache.org <solrac...@apache.org>
wrote:


+1


2017-11-21 5:12 GMT-06:00 Nate Smith <nathan...@apache.org>:

Hello,

I’d like to call a vote in regards to the following:

Creation of an Ingest redesign branch (name to be determined) and linkage
of this branch to a Jira epic.

These are the proposed requirements:

1. Extendible to any source.
2. NRT latency between ingestion and output to the HDFS/persistent store.
3. Maintain the integrity of the ODM module in order to facilitate
seamless integration with applications.

Please see the design spec for more details prior to voting:
https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#

I will leave the vote open for 3 days,

- Nathanael

On Nov 17, 2017, at 2:28 PM, Mubashir Kazia <mka...@cloudera.com> wrote:

+1 for breaking down into tasks.

Let's resolve the comments in the doc and put it up for vote. Looks good to
me otherwise.

Thanks Vartika for working on this.

On Wed, Nov 8, 2017 at 10:36 PM, Michael Ridley <mrid...@cloudera.com> wrote:

I just realized that in my re-writing and editing of the last email I took
out the part where I said that I thought it looks good.  But I do think it
looks good :-)

Michael

On Wed, Nov 8, 2017 at 10:35 PM, Michael Ridley <mrid...@cloudera.com> wrote:

Hi Vartika-

Thanks for the reminder about the ingestion design document.  Suggest we
open a JIRA (unless one has already been opened and I missed it) with the
specific tasks broken down (create an Epic if necessary) and attach a PDF
version of the architecture document.  Would also be great to get this
document into the Spot repo.

Do we need a separate upstream branch?  I don't think there has been much
development process discussion here but I would expect the work to be in
the main dev branch after the patches/PRs are accepted and the JIRAs are
resolved.

Michael

On Tue, Oct 31, 2017 at 11:31 PM, Vartika Singh <vsi...@cloudera.com> wrote:

Hello all,

The following document has been out for discussion for a while:

https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#heading=h.ptz51qlqt0t

I would like to start a new branch for ingestion component. A

Again, the idea of ingestion component is the following:

1. Extendible to any source.
2. NRT latency between ingestion and output to the HDFS/persistent store.
3. Maintain the integrity of the ODM module in order to facilitate
seamless integration with applications.


Thoughts?

Vartika Singh




--
Michael Ridley <mrid...@cloudera.com>
office: (650) 352-1337
mobile: (571) 438-2420
Senior Solutions Architect
Cloudera




--
Michael Ridley <mrid...@cloudera.com>
office: (650) 352-1337
mobile: (571) 438-2420
Senior Solutions Architect
Cloudera






--
Vartika Singh
Cloudera






--
Michael Ridley <mrid...@cloudera.com>
office: (650) 352-1337
mobile: (571) 438-2420
Senior Solutions Architect
Cloudera




--
Michael Ridley <mrid...@cloudera.com>
office: (650) 352-1337
mobile: (571) 438-2420
Senior Solutions Architect
Cloudera




-- 
Michael Ridley <mrid...@cloudera.com>
office: (650) 352-1337
mobile: (571) 438-2420
Senior Solutions Architect
Cloudera


Re: [VOTE] New development branch for ingestion component

2017-11-21 Thread Nate Smith
+1

> On Nov 21, 2017, at 12:41 PM, Vartika Singh <vsi...@cloudera.com> wrote:
> 
> +1
> 
> On Tue, Nov 21, 2017 at 12:21 PM, Cesar Berho <ce...@apache.org> wrote:
> 
>> +1
>> 
>> On Tue, Nov 21, 2017 at 12:26 PM, solrac...@apache.org <solrac...@apache.org>
>> wrote:
>> 
>>> +1
>>> 
>>> 
>>> 2017-11-21 5:12 GMT-06:00 Nate Smith <nathan...@apache.org>:
>>> 
>>>> Hello,
>>>> 
>>>> I’d like to call a vote in regards to the following:
>>>> 
>>>> Creation of an Ingest redesign branch (name to be determined) and linkage
>>>> of this branch to a Jira epic.
>>>>
>>>> These are the proposed requirements:
>>>>
>>>>  1. Extendible to any source.
>>>>  2. NRT latency between ingestion and output to the HDFS/persistent store.
>>>>  3. Maintain the integrity of the ODM module in order to facilitate
>>>>  seamless integration with applications.
>>>> 
>>>> Please see the design spec for more details prior to voting:
>>>> https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#
>>>> 
>>>> I will leave the vote open for 3 days,
>>>> 
>>>> - Nathanael
>>>> 
>>>> On Nov 17, 2017, at 2:28 PM, Mubashir Kazia <mka...@cloudera.com> wrote:
>>>>
>>>> +1 for breaking down into tasks.
>>>>
>>>> Let's resolve the comments in the doc and put it up for vote. Looks good to
>>>> me otherwise.
>>>> 
>>>> Thanks Vartika for working on this.
>>>> 
>>>> On Wed, Nov 8, 2017 at 10:36 PM, Michael Ridley <mrid...@cloudera.com>
>>>> wrote:
>>>> 
>>>> I just realized that in my re-writing and editing of the last email I took
>>>> out the part where I said that I thought it looks good.  But I do think it
>>>> looks good :-)
>>>> 
>>>> Michael
>>>> 
>>>> On Wed, Nov 8, 2017 at 10:35 PM, Michael Ridley <mrid...@cloudera.com>
>>>> wrote:
>>>> 
>>>> Hi Vartika-
>>>> 
>>>> Thanks for the reminder about the ingestion design document.  Suggest we
>>>> open a JIRA (unless one has already been opened and I missed it) with the
>>>> specific tasks broken down (create an Epic if necessary) and attach a PDF
>>>> version of the architecture document.  Would also be great to get this
>>>> document into the Spot repo.
>>>>
>>>> Do we need a separate upstream branch?  I don't think there has been much
>>>> development process discussion here but I would expect the work to be in
>>>> the main dev branch after the patches/PRs are accepted and the JIRAs are
>>>> resolved.
>>>> 
>>>> Michael
>>>> 
>>>> On Tue, Oct 31, 2017 at 11:31 PM, Vartika Singh <vsi...@cloudera.com>
>>>> wrote:
>>>> 
>>>> Hello all,
>>>> 
>>>> The following document has been out for discussion for a while:
>>>> 
>>>> https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#heading=h.ptz51qlqt0t
>>>> 
>>>> I would like to start a new branch for ingestion component. A
>>>> 
>>>> Again, the idea of ingestion component is the following:
>>>> 
>>>>  1. Extendible to any source.
>>>>  2. NRT latency between ingestion and output to the HDFS/persistent
>>>> store.
>>>>  3. Maintain the integrity of the ODM module in order to facilitate
>>>>  seamless integration with applications.
>>>> 
>>>> 
>>>> Thoughts?
>>>> 
>>>> Vartika Singh
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Michael Ridley <mrid...@cloudera.com>
>>>> office: (650) 352-1337
>>>> mobile: (571) 438-2420
>>>> Senior Solutions Architect
>>>> Cloudera
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Michael Ridley <mrid...@cloudera.com>
>>>> office: (650) 352-1337
>>>> mobile: (571) 438-2420
>>>> Senior Solutions Architect
>>>> Cloudera
>>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Vartika Singh
> Cloudera



[VOTE] New development branch for ingestion component

2017-11-21 Thread Nate Smith
Hello,

I’d like to call a vote in regards to the following:

Creation of an Ingest redesign branch (name to be determined) and linkage
of this branch to a Jira epic.

These are the proposed requirements:

  1. Extendible to any source.
  2. NRT latency between ingestion and output to the HDFS/persistent store.
  3. Maintain the integrity of the ODM module in order to facilitate
  seamless integration with applications.

Please see the design spec for more details prior to voting:
https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#

I will leave the vote open for 3 days,

- Nathanael

On Nov 17, 2017, at 2:28 PM, Mubashir Kazia  wrote:

+1 for breaking down into tasks.

Let's resolve the comments in the doc and put it up for vote. Looks good to
me otherwise.

Thanks Vartika for working on this.

On Wed, Nov 8, 2017 at 10:36 PM, Michael Ridley 
wrote:

I just realized that in my re-writing and editing of the last email I took
out the part where I said that I thought it looks good.  But I do think it
looks good :-)

Michael

On Wed, Nov 8, 2017 at 10:35 PM, Michael Ridley 
wrote:

Hi Vartika-

Thanks for the reminder about the ingestion design document.  Suggest we
open a JIRA (unless one has already been opened and I missed it) with the
specific tasks broken down (create an Epic if necessary) and attach a PDF
version of the architecture document.  Would also be great to get this
document into the Spot repo.

Do we need a separate upstream branch?  I don't think there has been much
development process discussion here but I would expect the work to be in
the main dev branch after the patches/PRs are accepted and the JIRAs are
resolved.

Michael

On Tue, Oct 31, 2017 at 11:31 PM, Vartika Singh 
wrote:

Hello all,

The following document has been out for discussion for a while:

https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#heading=h.ptz51qlqt0t

I would like to start a new branch for ingestion component. A

Again, the idea of ingestion component is the following:

  1. Extendible to any source.
  2. NRT latency between ingestion and output to the HDFS/persistent
store.
  3. Maintain the integrity of the ODM module in order to facilitate
  seamless integration with applications.


Thoughts?

Vartika Singh




--
Michael Ridley 
office: (650) 352-1337
mobile: (571) 438-2420
Senior Solutions Architect
Cloudera




--
Michael Ridley 
office: (650) 352-1337
mobile: (571) 438-2420
Senior Solutions Architect
Cloudera


Re: [Proposal] PR voting process changes

2017-10-18 Thread Nate Smith
I'm sorry, it looks like my last email didn't go to dev@.

Do we need to have a more structured vote on this?
I did not see any negative opinions, only a few points on the allotted time
and revisiting at a more "mature" point in the future.

Let me know,

- Nathanael

On Fri, Sep 29, 2017 at 11:54 AM, Nate Smith <natedogs...@gmail.com> wrote:

> Bump,
>
> Do we need to take an official vote on this?
>
> +1 from me of course on the change, and it seems that we're all in
> agreement.
>
> On Fri, Sep 22, 2017 at 1:59 PM, Cesar Berho <ce...@apache.org> wrote:
>
>> +1  on the 48 hrs period.
>>
>> On Fri, Sep 22, 2017 at 10:15 AM, Gonzalez, Victor <
>> victor.gonza...@intel.com> wrote:
>>
>>> +1 with 48 hours period
>>>
>>> Sent from my iPhone
>>>
>>> > On Sep 21, 2017, at 3:52 PM, Jon Zeolla <jonzeo...@apache.org> wrote:
>>> >
>>> > I agree, at least one +1 from a committer as a minimum bar is pretty
>>> > reasonable.  For bigger changes usually having more people review and test
>>> > makes sense, but I've seen that handled as more of a one off.
>>> >
>>> > I'm usually in favor of a 24 hour wait as well, but could see it go either
>>> > way here.
>>> >
>>> > Jon
>>> >
>>> >> On Thu, Sep 21, 2017, 16:44 <jar...@apache.org> wrote:
>>> >>
>>> >> I would recommend to make contributing to Spot as easily as possible
>>> >> because any hurdle or obstacle will make contributing harder and thus will
>>> >> discourage potential long term contributors.
>>> >>
>>> >> Pretty much all other projects that I’m involved with at ASF are following
>>> >> something in the lines of what Nate is describing. Anyone on the internet
>>> >> can submit a patch and all it takes is a single committer who does review
>>> >> and then the patch is merged to master branch. Some projects do a “cool
>>> >> off" window before the “review” and “merge” to make sure that other
>>> >> committers have time to jump in - projects like Hadoop and Hive tend to
>>> >> give 24 hours, projects like Sqoop or Flume simply commit immediately. Any
>>> >> other committer however have always a chance to jump in and pretty much
>>> >> VETO the patch - provided there is a good explanation for the push back.
>>> >>
>>> >> Jarcec
>>> >>
>>> >>> On Sep 21, 2017, at 1:15 PM, Michael Ridley <mrid...@cloudera.com> wrote:
>>> >>>
>>> >>> Sounds like a good approach.  I'm all in favor of following a process that
>>> >>> works for other ASF projects.
>>> >>>
>>> >>> Speaking of votes by committer, I think any vote would be recorded as
>>> >>> binding or non-binding based on committer status.  I am not a committer so
>>> >>> I always make sure to mark mine as non-binding.
>>> >>>
>>> >>> Michael
>>> >>>
>>> >>> On Thu, Sep 21, 2017 at 1:28 PM, Nate Smith <natedogs...@gmail.com> wrote:
>>> >>>
>>> >>>> Also,
>>> >>>>
>>> >>>> As a point of consideration it's good to highlight that in such a scenario
>>> >>>> where a +1 is given and 48 hours to review prior to merge, any -1 should
>>> >>>> reset the vote in my mind. Votes of such nature would have to be restricted
>>> >>>> to committers on the project.
>>> >>>>
>>> >>>> On Thu, Sep 21, 2017 at 10:22 AM, Nate Smith <nathan...@apache.org> wrote:
>>> >>>>
>>> >>>>> Hello,
>>> >>>>>
>>> >>>>> From my own experience and also in talking directly with a few committers
>>> >>>>> to the project the requirement for three +1's from committers should be
>>> >>>>> reviewed.
>>> >>>>>
>>> >>>>> My understanding is that other projects in the ASF simply require one vote
>>> >>>>> and provide some time for review by others prior to merging (such as a
>>> >>>>> 24-48 hour period). However more emphasis is placed on refining code in
>>> >>>>> preparation for releases.
>>> >>>>>
>>> >>>>> As it stands today we require at least three +1's before merge, and there
>>> >>>>> is no time requirement.
>>> >>>>>
>>> >>>>> Since we are a growing community, and the goal is to develop more code
>>> >>>>> contributors I think it is important to bring this up for review in hopes
>>> >>>>> that we can adopt something that allows faster iterations with a strong
>>> >>>>> focus on polishing for future releases.
>>> >>>>>
>>> >>>>> - Nathanael
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Michael Ridley <mrid...@cloudera.com>
>>> >>> office: (650) 352-1337
>>> >>> mobile: (571) 438-2420
>>> >>> Senior Solutions Architect
>>> >>> Cloudera
>>> >>
>>> >> --
>>> >
>>> > Jon
>>>
>>
>>
>


Re: [Proposal] PR voting process changes

2017-09-21 Thread Nate Smith
Also,

As a point of consideration, it's worth highlighting that in a scenario
where a +1 is given with 48 hours to review prior to merge, any -1 should,
in my mind, reset the vote. Such votes would have to be restricted
to committers on the project.
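
To make the proposed rule concrete, here is a minimal Python sketch of how I
read it. Everything in it is an assumption for illustration only: the names
(Vote, PullRequestVote, can_merge), the fixed 48-hour window, and the reading
that a fresh committer +1 is needed after a -1 resets the vote. It is not
existing Spot tooling or ASF policy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

# Assumed review window, taken from the 48-hour period discussed in this thread.
REVIEW_WINDOW = timedelta(hours=48)


@dataclass
class Vote:
    voter: str
    value: int          # +1 or -1
    is_committer: bool  # only committer votes are treated as binding
    cast_at: datetime


@dataclass
class PullRequestVote:
    """Hypothetical model of the proposed rule for a single pull request."""
    votes: List[Vote] = field(default_factory=list)

    def cast(self, vote: Vote) -> None:
        self.votes.append(vote)

    def window_start(self) -> Optional[datetime]:
        """Time of the first committer +1 since the latest committer -1."""
        start = None
        for v in sorted(self.votes, key=lambda v: v.cast_at):
            if not v.is_committer:
                continue  # non-binding votes are recorded but ignored here
            if v.value < 0:
                start = None  # any committer -1 resets the vote
            elif v.value > 0 and start is None:
                start = v.cast_at
        return start

    def can_merge(self, now: datetime) -> bool:
        """True once a committer +1 has stood for the full review window."""
        start = self.window_start()
        return start is not None and now - start >= REVIEW_WINDOW
```

Under these assumptions, a PR with one committer +1 becomes mergeable 48 hours
later, and a committer -1 in between means the clock only restarts at the next
committer +1.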

On Thu, Sep 21, 2017 at 10:22 AM, Nate Smith <nathan...@apache.org> wrote:

> Hello,
>
> From my own experience and also in talking directly with a few committers
> to the project the requirement for three +1's from committers should be
> reviewed.
>
> My understanding is that other projects in the ASF simply require one vote
> and provide some time for review by others prior to merging (such as a
> 24-48 hour period). However more emphasis is placed on refining code in
> preparation for releases.
>
> As it stands today we require at least three +1's before merge, and there
> is no time requirement.
>
> Since we are a growing community, and the goal is to develop more code
> contributors I think it is important to bring this up for review in hopes
> that we can adopt something that allows faster iterations with a strong
> focus on polishing for future releases.
>
> - Nathanael
>


[Proposal] PR voting process changes

2017-09-21 Thread Nate Smith
Hello,

From my own experience, and also from talking directly with a few committers
to the project, the requirement for three +1's from committers should be
reviewed.

My understanding is that other projects in the ASF simply require one vote
and provide some time for review by others prior to merging (such as a
24-48 hour period). However more emphasis is placed on refining code in
preparation for releases.

As it stands today we require at least three +1's before merge, and there
is no time requirement.

Since we are a growing community, and the goal is to develop more code
contributors I think it is important to bring this up for review in hopes
that we can adopt something that allows faster iterations with a strong
focus on polishing for future releases.

- Nathanael


Re: [Discuss] - Future plans for Spot-ingest

2017-04-13 Thread Nate Smith
I was really hoping it came through ok,
Oh well :)
Here it is in image form:
http://imgur.com/a/DUDsD


> On Apr 13, 2017, at 4:05 PM, Segerlind, Nathan L 
>  wrote:
> 
> The diagram became garbled in the text format.
> Could you resend it as a pdf?
> 
> Thanks,
> Nate
> 
> -Original Message-
> From: Nathanael Smith [mailto:nathan...@apache.org] 
> Sent: Thursday, April 13, 2017 4:01 PM
> To: priv...@spot.incubator.apache.org; dev@spot.incubator.apache.org; 
> u...@spot.incubator.apache.org
> Subject: [Discuss] - Future plans for Spot-ingest
> 
> How would you like to see Spot-ingest change?
> 
> A. Continue development on the Python Master/Worker with a focus on
>    performance / error handling / logging
> B. Develop a Scala-based ingest to be in line with the code base from ingest,
>    ml, to OA (UI to continue being ipython/JS)
> C. Python ingest Worker with Scala-based Spark code for normalization and
>    input into the DB
> 
> Including the high level diagram:
> [Diagram garbled in plain text; recoverable labels:]
> - Master: A. Python, B. Scala, C. Python
> - Worker: A. Python
> - Spark Streaming: B. Scala, C. Scala
> - Note: running on a worker node in the Hadoop cluster
> - Storage: Local FS (binary/text log files), HDFS, Hive / Impala (Parquet)
> 
> Please let me know your thoughts,
> 
> - Nathanael
> 
> 
>