Re: not able to launch dataflow job - permission issues with service account

2018-11-22 Thread Ankur Chauhan
As the message says, enable the API from the Cloud Console or the gcloud CLI
utility (refer to the Dataflow docs).

If that's enabled, ensure the robot service account has the permissions it
needs - the Dataflow API, GCS, and anything else you are trying to access.

On Thu, Nov 22, 2018 at 04:33 Unais T  wrote:

> I am trying to run a simple Dataflow job in Google Cloud - it runs
> perfectly locally - but when I try to launch it I get the following
> error. I have tried a lot of debugging.
>
> Can someone help with this?
>
> INFO:root:Created job with id: [2018-11-22_02_57_07-12079060901530487381]
> INFO:root:To access the Dataflow monitoring console, please navigate to 
> https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-11-22_02_57_07-12079060901530487381?project=kkfas-main-account-221808
> INFO:root:Job 2018-11-22_02_57_07-12079060901530487381 is in state 
> JOB_STATE_PENDING
> INFO:root:2018-11-22T10:57:08.052Z: JOB_MESSAGE_DETAILED: Autoscaling is 
> enabled for job 2018-11-22_02_57_07-12079060901530487381. The number of 
> workers will be between 1 and 1000.
> INFO:root:2018-11-22T10:57:08.072Z: JOB_MESSAGE_DETAILED: Autoscaling was 
> automatically enabled for job 2018-11-22_02_57_07-12079060901530487381.
> INFO:root:2018-11-22T10:57:40.405Z: JOB_MESSAGE_ERROR: Workflow failed. 
> Causes: There was a problem refreshing your credentials. Please check:
> 1. Dataflow API is enabled for your project.
> 2. There is a robot service account for your project:
> service-[project 
> number]@dataflow-service-producer-prod.iam.gserviceaccount.com should have 
> access to your project. If this account does not appear in the permissions 
> tab for your project, contact Dataflow support.
> INFO:root:Job 2018-11-22_02_57_07-12079060901530487381 is in state 
> JOB_STATE_FAILED
> Traceback (most recent call last):
>   File 
> "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py",
>  line 162, in _run_module_as_main
> "__main__", fname, loader, pkg_name)
>   File 
> "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py",
>  line 72, in _run_code
> exec code in run_globals
>   File "/Users/u/Projects//digital/test.py", line 49, in 
> run()
>   File "/Users/u/Projects//dataflow//digital/test.py", line 44, in run
> return p.run().wait_until_finish()
>   File 
> "/Users/u/VirtualEnv/dataflow/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1122, in wait_until_finish
> (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: 
> Dataflow pipeline failed. State: FAILED, Error:
> Workflow failed. Causes: There was a problem refreshing your credentials. 
> Please check:
> 1. Dataflow API is enabled for your project.
> 2. There is a robot service account for your project:
> service-[project 
> number]@dataflow-service-producer-prod.iam.gserviceaccount.com should have 
> access to your project. If this account does not appear in the permissions 
> tab for your project, contact Dataflow support.
>
>


Re: [VOTE] Donating the Dataflow Worker code to Apache Beam

2018-09-17 Thread Ankur Chauhan
+1

Sent from my iPhone

> On Sep 17, 2018, at 15:26, Ankur Goenka  wrote:
> 
> +1
> 
>> On Sun, Sep 16, 2018 at 3:20 AM Maximilian Michels  wrote:
>> +1 (binding)
>> 
>> On 15.09.18 20:07, Reuven Lax wrote:
>> > +1
>> > 
>> > On Sat, Sep 15, 2018 at 9:40 AM Rui Wang wrote:
>> > 
>> > +1
>> > 
>> > -Rui
>> > 
>> > On Sat, Sep 15, 2018 at 12:32 AM Robert Bradshaw wrote:
>> > 
>> > +1 (binding)
>> > 
>> > On Sat, Sep 15, 2018 at 6:44 AM Tim wrote:
>> > 
>> > +1
>> > 
>> > On 15 Sep 2018, at 01:23, Yifan Zou wrote:
>> > 
>> >> +1
>> >>
>> >> On Fri, Sep 14, 2018 at 4:20 PM David Morávek wrote:
>> >>
>> >> +1
>> >>
>> >> On 15 Sep 2018, at 00:59, Anton Kedin wrote:
>> >>
>> >>> +1
>> >>>
>> >>> On Fri, Sep 14, 2018 at 3:22 PM Alan Myrvold wrote:
>> >>>
>> >>> +1
>> >>>
>> >>> On Fri, Sep 14, 2018 at 3:16 PM Boyuan Zhang wrote:
>> >>>
>> >>> +1
>> >>>
>> >>> On Fri, Sep 14, 2018 at 3:15 PM Henning Rohde wrote:
>> >>>
>> >>> +1
>> >>>
>> >>> On Fri, Sep 14, 2018 at 2:40 PM Ahmet Altay wrote:
>> >>>
>> >>> +1 (binding)
>> >>>
>> >>> On Fri, Sep 14, 2018 at 2:35 PM, Lukasz Cwik wrote:
>> >>>
>> >>> +1 (binding)
>> >>>
>> >>> On Fri, Sep 14, 2018 at 2:34 PM Pablo Estrada wrote:
>> >>>
>> >>> +1
>> >>>
>> >>> On Fri, Sep 14, 2018 at 2:32 PM Andrew Pilloud wrote:
>> >>>
>> >>> +1
>> >>>
>> >>> On Fri, Sep 14, 2018 at 2:31 PM Lukasz Cwik wrote:
>> >>>
>> >>> There was generally positive support and good feedback[1] but it was
>> >>> not unanimous. I wanted to bring the donation of the Dataflow worker
>> >>> code base to Apache Beam master to a vote.
>> >>>
>> >>> +1: Support having the Dataflow worker code as part of Apache Beam
>> >>> master branch
>> >>> -1: Dataflow worker code should live elsewhere
>> >>>
>> >>> 1:
>> >>> https://lists.apache.org/thread.html/89efd3bc1d30f3d43d4b361a5ee05bd52778c9dc3f43ac72354c2bd9@%3Cdev.beam.apache.org%3E


Re: rename: BeamRecord -> Row

2018-02-02 Thread Ankur Chauhan
++

On Fri, Feb 2, 2018 at 1:33 PM Rafael Fernandez  wrote:

> Very strong +1
>
>
> On Fri, Feb 2, 2018 at 1:24 PM Reuven Lax  wrote:
>
>> We're looking at renaming the BeamRecord class, that was used for columnar
>> data. There was sufficient discussion on the naming, that I want to make
>> sure the dev list is aware of naming plans here.
>>
>> BeamRecord is a columnar, field-based record. Currently it's used by
>> BeamSQL, and the plan is to use it for schemas as well. "Record" is a
>> confusing name for this class, as all elements in the Beam model are
>> referred to as "records," whether or not they have schemas. "Row" is a much
>> clearer name.
>>
>> There was a lot of discussion whether to name this BeamRow or just plain
>> Row (in the org.apache.beam.values namespace). The argument in favor of
>> BeamRow was so that people aren't forced to qualify their type names in the
>> case of a conflict with a Row from another package. The argument in favor
>> of Row was that it's a better name, it's in the Beam namespace anyway, and
>> it's what the rest of the world (Cassandra, Hive, Spark, etc.) calls
>> similar classes.
>>
>> Right now, consensus on the PR is leaning to Row. If you feel strongly,
>> please speak up :)
>>
>> Reuven
>>
>


Unable to use MapState with DoFnTester

2018-02-02 Thread Ankur Chauhan
Hi,

I am trying to write some tests for a DoFn that has one StateSpecs.map()
state declared. When I run the test I get this error:

java.lang.UnsupportedOperationException: Parameter
StateParameter{referent=StateDeclaration{id=indexKeys, field=private final
org.apache.beam.sdk.state.StateSpec
com.brightcove.rna.transforms.functions.GenerateMutationsFn.INDEX_KEYS_SPEC,
stateType=org.apache.beam.sdk.state.MapState<java.lang.String,
com.google.protobuf.ByteString>}} not supported by DoFnTester

at
org.apache.beam.sdk.transforms.DoFnTester$5.dispatchDefault(DoFnTester.java:720)
at
org.apache.beam.sdk.transforms.DoFnTester$5.dispatchDefault(DoFnTester.java:705)
at
org.apache.beam.sdk.transforms.reflect.DoFnSignature$Parameter$Cases$WithDefault.dispatch(DoFnSignature.java:260)
at
org.apache.beam.sdk.transforms.reflect.DoFnSignature$Parameter.match(DoFnSignature.java:195)
at org.apache.beam.sdk.transforms.DoFnTester.(DoFnTester.java:704)
at org.apache.beam.sdk.transforms.DoFnTester.of(DoFnTester.java:92)
at
com.brightcove.rna.transforms.functions.GenerateMutationsFnTest.testDefaultTimestamp(GenerateMutationsFnTest.java:42)

Is MapState supported with DoFnTester?
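
For illustration, a minimal sketch of the usual workaround: drive the stateful
DoFn through a TestPipeline on the direct runner (which supports state) instead
of DoFnTester. MyStatefulFn is a hypothetical stand-in for GenerateMutationsFn,
and the element types and expected values are illustrative only.

import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
import org.junit.Test;

public class MyStatefulFnTest {

    @Rule
    public final transient TestPipeline pipeline = TestPipeline.create();

    @Test
    public void emitsBufferedElements() {
        // Feed a couple of keyed elements through the stateful DoFn.
        PCollection<KV<String, String>> output = pipeline
                .apply(Create.of(KV.of("session-1", "eventA"), KV.of("session-1", "eventB")))
                .apply(ParDo.of(new MyStatefulFn()));  // hypothetical stateful DoFn

        // Expected contents are illustrative; the direct runner supports
        // MapState/BagState and timers, unlike DoFnTester.
        PAssert.that(output).containsInAnyOrder(
                KV.of("session-1", "eventA"),
                KV.of("session-1", "eventB"));

        pipeline.run().waitUntilFinish();
    }
}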

-- Ankur Chauhan


Re: [PROPOSAL] Beam Go SDK feature branch

2017-12-02 Thread Ankur Chauhan
Hi

This is amazing. Having used the Java SDK for over two years now and recently
transitioned to writing Go for a bunch of microservices, I really like the
simplicity. I used Gleam to do data processing and loved it. That said, I
would much rather have the Beam SDK paradigms.


Ankur


On Fri, Dec 1, 2017 at 14:39 Kenneth Knowles  wrote:

> This is awesome!
>
> With three SDKs the Beam portability vision will have a lot to offer.
>
> Kenn
>
> On Fri, Dec 1, 2017 at 10:49 AM, Robert Bradshaw 
> wrote:
>
>> Very Exciting!
>>
>> The Go language is in a very different point of the design space than
>> either Java or Python; it's interesting to see how you've explored
>> making this fit with the Beam model. Thanks for the detailed design
>> doc.
>>
>> +1 to targeting the portability framework directly. Once all runners
>> are upgraded to use this too it'll just work everywhere.
>>
>> On Thu, Nov 30, 2017 at 8:51 PM, Jean-Baptiste Onofré 
>> wrote:
>> > Hi Henning,
>> >
>> > Thanks for the update, that's great !
>> >
>> > Regards
>> > JB
>> >
>> > On 12/01/2017 12:40 AM, Henning Rohde wrote:
>> >>
>> >> Hi everyone,
>> >>
>> >>   We have been prototyping a Go SDK for Beam for some time and have
>> >> reached a point, where we think this effort might be of interest to the
>> >> wider Beam community and would benefit from being developed in a proper
>> >> feature branch. We have prepared a PR to that end:
>> >>
>> >> https://github.com/apache/beam/pull/4200
>> >>
>> >> Please note that the prototype supports batch only, for now, and
>> includes
>> >> various examples that run on the Go direct runner. Go would be the
>> first SDK
>> >> targeting the portability framework exclusively and our plan is to
>> extend
>> >> and benefit from that ecosystem.
>> >>
>> >> We have also prepared an RFC document with the initial design,
>> motivations
>> >> and tradeoffs made:
>> >>
>> >> https://s.apache.org/beam-go-sdk-design-rfc
>> >> 
>> >>
>> >> The challenge is that Go is quite a tricky language for Beam due to
>> >> various limitations, notably strong typing w/o generics, and so the
>> >> approaches taken by Java and Python do not readily apply.
>> >>
>> >> Of course, neither the prototype nor the design are in any way final --
>> >> there are many open questions and we absolutely welcome ideas and
>> >> contributions. Please let us know if you have any comments or
>> objections (or
>> >> would like to help!).
>> >>
>> >> Thanks,
>> >>   Henning Rohde, Bill Neubauer, and Robert Burke
>> >
>> >
>> > --
>> > Jean-Baptiste Onofré
>> > jbono...@apache.org
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>>
>
> --
Sent from lazarus


[grpc-io] Minimum and recommended pod requirements

2017-09-27 Thread Ankur Chauhan
Hi

I'm running a grpc-java application in a Kubernetes cluster on GKE. This
application listens for requests from users, reads Google Cloud Bigtable over
gRPC using the official Bigtable client, and works with the scanned rows doing
some simple parsing. Whenever I run the server process on my laptop (8 cores,
4 GB heap) everything is really fast, but on the Kubernetes cluster, with
resources set to 4 GB and 2 cores, I consistently see very high latencies as
well as long pauses in RPCs.

Is there any guidance on what size pods I should use to run a gRPC container?
What should I look for in terms of monitoring?



Re: [grpc-io] [grpc-java] Header propagation in service mesh (linkerd)

2017-07-30 Thread Ankur Chauhan
Published a cleaned-up version of these interceptors at
https://github.com/ankurcha/linkerd-grpc-interceptors
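
For illustration, a rough sketch of the two interceptors described in this
thread; the repository above (and the gist quoted below) is the authoritative
version, and the class, key, and constant names here are illustrative only.

import io.grpc.*;

final class L5dPropagation {
    // Context key carrying the captured "l5d-" headers for the current call.
    static final Context.Key<Metadata> L5D_HEADERS = Context.key("l5d-headers");

    // Server side: copy incoming "l5d-" headers into the gRPC Context.
    static final ServerInterceptor CAPTURE = new ServerInterceptor() {
        @Override
        public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
                ServerCall<ReqT, RespT> call, Metadata headers, ServerCallHandler<ReqT, RespT> next) {
            Metadata l5d = new Metadata();
            for (String name : headers.keys()) {
                // Skip binary ("-bin") keys; this sketch only forwards ASCII headers.
                if (name.startsWith("l5d-") && !name.endsWith(Metadata.BINARY_HEADER_SUFFIX)) {
                    Metadata.Key<String> key = Metadata.Key.of(name, Metadata.ASCII_STRING_MARSHALLER);
                    Iterable<String> values = headers.getAll(key);
                    if (values != null) {
                        for (String value : values) {
                            l5d.put(key, value);
                        }
                    }
                }
            }
            Context ctx = Context.current().withValue(L5D_HEADERS, l5d);
            return Contexts.interceptCall(ctx, call, headers, next);
        }
    };

    // Client side: merge the captured headers into every outgoing call.
    static final ClientInterceptor FORWARD = new ClientInterceptor() {
        @Override
        public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(
                MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
            return new ForwardingClientCall.SimpleForwardingClientCall<ReqT, RespT>(
                    next.newCall(method, callOptions)) {
                @Override
                public void start(Listener<RespT> responseListener, Metadata headers) {
                    Metadata captured = L5D_HEADERS.get();
                    if (captured != null) {
                        headers.merge(captured);
                    }
                    super.start(responseListener, headers);
                }
            };
        }
    };
}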

On Sunday, July 30, 2017 at 12:42:37 AM UTC-7, Ankur Chauhan wrote:
>
> Thanks for the code example Spencer. This is exactly what I needed. I'll 
> make it into a library and publish it for others to enjoy once i test it a 
> little bit.
>
> On Friday, July 28, 2017 at 10:41:34 AM UTC-7, Spencer Fang wrote:
>>
>> In gist form for easier mailing list archival, because I'd probably clean 
>> up that branch at some point:
>> https://gist.github.com/zpencer/e581ae1b55974427512d1ad8f157e323
>>
>> On Fri, Jul 28, 2017 at 10:28 AM, Spencer Fang <spenc...@google.com> 
>> wrote:
>>
>>> Hello Ankur,
>>> The io.grpc.Context is indeed the right thing to use here. What you want 
>>> to do is to first create a ServerInterceptor that looks at the incoming 
>>> headers and pulls out all "l5d-" prefixed keys. The ServerCall.Listener is 
>>> how your application defines how to respond to incoming requests, so the 
>>> interceptor must make sure that a io.grpc.Context containing the "l5d-" 
>>> info is present whenever your code runs. Next, you need a ClientInterceptor 
>>> that looks at the current io.grpc.Context, and merges "l5d-" headers into 
>>> any outgoing headers.
>>>
>>> I have created a branch on github that shows what I mean: 
>>> https://github.com/grpc/grpc-java/compare/master...zpencer:grpc-io-l5d-demo?expand=1
>>>
>>>
>>>
>>> On Thu, Jul 27, 2017 at 9:01 PM, Ankur Chauhan <an...@malloc64.com> 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to integrate some of our grpc services with linkerd ( a 
>>>> service mesh ). As a part of doing that, the recommendation is that the 
>>>> apps should propagate any headers with "l5d-" prefix to all downstream 
>>>> services. I tried looking at some tests for io.grpc.Context but can't 
>>>> figure out what would be the best way to do this. Anyone care to comment? 
>>>> This should be similar to how deadlines are propagated (i think).
>>>>
>>>> -- Ankur
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "grpc.io" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to grpc-io+u...@googlegroups.com.
>>>> To post to this group, send email to grp...@googlegroups.com.
>>>> Visit this group at https://groups.google.com/group/grpc-io.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/grpc-io/19357e59-34d8-461a-997e-287cac9f096b%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/grpc-io/19357e59-34d8-461a-997e-287cac9f096b%40googlegroups.com?utm_medium=email_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>>
>>> -- 
>>> Spencer Fang
>>>
>>
>>
>>
>> -- 
>> Spencer Fang
>>
>



Re: [grpc-io] [grpc-java] Header propagation in service mesh (linkerd)

2017-07-30 Thread Ankur Chauhan
Thanks for the code example Spencer. This is exactly what I needed. I'll 
make it into a library and publish it for others to enjoy once i test it a 
little bit.

On Friday, July 28, 2017 at 10:41:34 AM UTC-7, Spencer Fang wrote:
>
> In gist form for easier mailing list archival, because I'd probably clean 
> up that branch at some point:
> https://gist.github.com/zpencer/e581ae1b55974427512d1ad8f157e323
>
> On Fri, Jul 28, 2017 at 10:28 AM, Spencer Fang <spenc...@google.com 
> > wrote:
>
>> Hello Ankur,
>> The io.grpc.Context is indeed the right thing to use here. What you want 
>> to do is to first create a ServerInterceptor that looks at the incoming 
>> headers and pulls out all "l5d-" prefixed keys. The ServerCall.Listener is 
>> how your application defines how to respond to incoming requests, so the 
>> interceptor must make sure that a io.grpc.Context containing the "l5d-" 
>> info is present whenever your code runs. Next, you need a ClientInterceptor 
>> that looks at the current io.grpc.Context, and merges "l5d-" headers into 
>> any outgoing headers.
>>
>> I have created a branch on github that shows what I mean: 
>> https://github.com/grpc/grpc-java/compare/master...zpencer:grpc-io-l5d-demo?expand=1
>>
>>
>>
>> On Thu, Jul 27, 2017 at 9:01 PM, Ankur Chauhan <an...@malloc64.com 
>> > wrote:
>>
>>> Hi,
>>>
>>> I am trying to integrate some of our grpc services with linkerd ( a 
>>> service mesh ). As a part of doing that, the recommendation is that the 
>>> apps should propagate any headers with "l5d-" prefix to all downstream 
>>> services. I tried looking at some tests for io.grpc.Context but can't 
>>> figure out what would be the best way to do this. Anyone care to comment? 
>>> This should be similar to how deadlines are propagated (i think).
>>>
>>> -- Ankur
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "grpc.io" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to grpc-io+u...@googlegroups.com .
>>> To post to this group, send email to grp...@googlegroups.com 
>>> .
>>> Visit this group at https://groups.google.com/group/grpc-io.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/grpc-io/19357e59-34d8-461a-997e-287cac9f096b%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/grpc-io/19357e59-34d8-461a-997e-287cac9f096b%40googlegroups.com?utm_medium=email_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> -- 
>> Spencer Fang
>>
>
>
>
> -- 
> Spencer Fang
>



[grpc-io] [grpc-java] Header propagation in service mesh (linkerd)

2017-07-27 Thread Ankur Chauhan
Hi,

I am trying to integrate some of our grpc services with linkerd ( a service 
mesh ). As a part of doing that, the recommendation is that the apps should 
propagate any headers with "l5d-" prefix to all downstream services. I 
tried looking at some tests for io.grpc.Context but can't figure out what 
would be the best way to do this. Anyone care to comment? This should be 
similar to how deadlines are propagated (i think).

-- Ankur



Using timers to flush elements buffered using state API

2017-06-13 Thread Ankur Chauhan
Hi all,


I am trying to figure out how I can use the State API and Timer API to
build a DoFn that can build sessions by buffering some of the elements until
all required information is available or the session window gap duration
expires.


Currently, I have the following defined in my
DoFn<KV<String, VideoEvent>, KV<String, VideoEvent>>:

@StateId("pendingVideoEvents")
private final StateSpec<BagState<VideoEvent>> _PENDING_VIDEO_EVENTS_STATE_SPEC =
        StateSpecs.bag(ProtoCoder.of(VideoEvent.class));

@TimerId("finallyCleanup")
private final TimerSpec _FINALLY_TIMER_SPEC =
        TimerSpecs.timer(TimeDomain.EVENT_TIME);

@OnTimer("finallyCleanup")
public void onFinallyCleanup(OnTimerContext c,
        @StateId("pendingVideoEvents") BagState<VideoEvent> pendingVideoEventsState) {
    Iterable<VideoEvent> events = pendingVideoEventsState.read();
    if (events == null) {
        return;
    }
    for (VideoEvent event : events) {
        String key = event.getSession() + "/" + event.getVideo();
        LOG.debug("Emitting {} -> {}", key, event);
        c.output(KV.of(key, event));
    }
}

The processElement method sets the timer when elements are added to the
pendingVideoEventsState state object as follows:

// set the timer to some point
finallyTimer.offset(gapDuration).align(Duration.millis(1)).setRelative();


Here I am not too sure if I used the timer api correctly. I see lots of
errors:

"java.lang.IllegalStateException: TimestampCombiner moved element from
2017-06-14T04:27:05.000Z to earlier time 2017-06-14T04:17:13.732Z for
window [2017-06-14T04:12:13.733Z..2017-06-14T04:17:13.733Z)
at
org.apache.beam.runners.core.java.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:738)
at org.apache.beam.runners.core.WatermarkHold.shift(WatermarkHold.java:206)
at
org.apache.beam.runners.core.WatermarkHold.addElementHold(WatermarkHold.java:239)
at
org.apache.beam.runners.core.WatermarkHold.addHolds(WatermarkHold.java:190)
at
org.apache.beam.runners.core.ReduceFnRunner.processElement(ReduceFnRunner.java:608)
at
org.apache.beam.runners.core.ReduceFnRunner.processElements(ReduceFnRunner.java:347)
at
com.google.cloud.dataflow.worker.runners.worker.GroupAlsoByWindowViaWindowSetDoFn.processElement(GroupAlsoByWindowViaWindowSetDoFn.java:89)
at
com.google.cloud.dataflow.worker.runners.worker.SimpleOldDoFnRunner.invokeProcessElement(SimpleOldDoFnRunner.java:122)
at
com.google.cloud.dataflow.worker.runners.worker.SimpleOldDoFnRunner.processElement(SimpleOldDoFnRunner.java:101)
at
org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:73)
at
com.google.cloud.dataflow.worker.runners.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:106)
at
com.google.cloud.dataflow.worker.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
at
com.google.cloud.dataflow.worker.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
at
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:198)
at
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
at
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:72)
at
com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:791)
at
com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker.access$600(StreamingDataflowWorker.java:104)
at
com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker$9.run(StreamingDataflowWorker.java:873)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"
This makes me think I am getting something wrong with the timers. Can
someone explain the correct way to "flush" state when a window closes
(with or without timers)? Also, a general description / example of using the
timer API would be helpful.
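
For illustration, a minimal sketch of one way to arrange the flush: set the
event-time timer to the end of the enclosing window so it fires when the window
closes. This assumes a non-merging window (e.g. fixed windows) and that the
method lives in the same DoFn as the fields above; parameter names mirror that
snippet.

// Sketch: fire the flush timer when the watermark passes the end of the
// current window, instead of at an offset from the element's timestamp.
@ProcessElement
public void processElement(
        ProcessContext c,
        BoundedWindow window,
        @StateId("pendingVideoEvents") BagState<VideoEvent> pendingVideoEventsState,
        @TimerId("finallyCleanup") Timer finallyTimer) {
    // Buffer the element until the window closes.
    pendingVideoEventsState.add(c.element().getValue());
    // Event-time timer aligned with window expiry; the @OnTimer method above
    // then drains the bag.
    finallyTimer.set(window.maxTimestamp());
}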

-- Ankur Chauhan


Re: How to decrease latency when using PubsubIO.Read?

2017-05-24 Thread Ankur Chauhan
There are a few things to look at here:

* In the logs, are there any messages like "No operations completed within the
last 61 seconds. There are still 1 simple operations and 1 complex operations
in progress."? This means you are underscaled on the Bigtable side and would
benefit from increasing the node count.
* We also saw some improvement in performance (workload dependent) by going to
a bigger worker machine type.
* Another optimization that worked for our use case:

// streaming dataflow has larger machines with smaller bundles, so we can
// queue up a lot more without blowing up
private static BigtableOptions createStreamingBTOptions(AnalyticsPipelineOptions opts) {
    return new BigtableOptions.Builder()
            .setProjectId(opts.getProject())
            .setInstanceId(opts.getBigtableInstanceId())
            .setUseCachedDataPool(true)
            .setDataChannelCount(32)
            .setBulkOptions(new BulkOptions.Builder()
                    .setUseBulkApi(true)
                    .setBulkMaxRowKeyCount(2048)
                    .setBulkMaxRequestSize(8_388_608L)
                    .setAsyncMutatorWorkerCount(32)
                    .build())
            .build();
}
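
For context, a sketch of how options like these can be plugged into Beam's
BigtableIO write transform; the table name and the upstream mutations
PCollection are illustrative, not from the pipeline discussed here.

// Sketch only: wire the tuned BigtableOptions into BigtableIO.write().
// "mutations" is assumed to be a PCollection<KV<ByteString, Iterable<Mutation>>>
// produced earlier in the pipeline; "events" is an illustrative table name.
mutations.apply("WriteToBigtable",
        BigtableIO.write()
                .withBigtableOptions(createStreamingBTOptions(opts))
                .withTableId("events"));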

There is a lot of trial and error involved in getting the end-to-end latency
down, so I would suggest enabling profiling with the --saveProfilesToGcs option
and getting a sense of what exactly is happening.

— Ankur Chauhan

> On May 24, 2017, at 9:09 AM, Josh <jof...@gmail.com> wrote:
> 
> Ah ok - I am using the Dataflow runner. I didn't realise about the custom 
> implementation being provided at runtime...
> 
> Any ideas of how to tweak my job to either lower the latency consuming from 
> PubSub or to lower the latency in writing to Bigtable?
> 
> 
> On Wed, May 24, 2017 at 4:14 PM, Lukasz Cwik <lc...@google.com 
> <mailto:lc...@google.com>> wrote:
> What runner are you using (Flink, Spark, Google Cloud Dataflow, Apex, ...)?
> 
> On Wed, May 24, 2017 at 8:09 AM, Ankur Chauhan <an...@malloc64.com 
> <mailto:an...@malloc64.com>> wrote:
> Sorry that was an autocorrect error. I meant to ask - what dataflow runner 
> are you using? If you are using google cloud dataflow then the PubsubIO class 
> is not the one doing the reading from the pubsub topic. They provide a custom 
> implementation at run time.
> 
> Ankur Chauhan 
> Sent from my iPhone
> 
> On May 24, 2017, at 07:52, Josh <jof...@gmail.com <mailto:jof...@gmail.com>> 
> wrote:
> 
>> Hi Ankur,
>> 
>> What do you mean by runner address?
>> Would you be able to link me to the comment you're referring to?
>> 
>> I am using the PubsubIO.Read class from Beam 2.0.0 as found here:
>> https://github.com/apache/beam/blob/release-2.0.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
>>  
>> <https://github.com/apache/beam/blob/release-2.0.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java>
>> 
>> Thanks,
>> Josh
>> 
>> On Wed, May 24, 2017 at 3:36 PM, Ankur Chauhan <an...@malloc64.com 
>> <mailto:an...@malloc64.com>> wrote:
>> What runner address you using. Google cloud dataflow uses a closed source 
>> version of the pubsub reader as noted in a comment on Read class. 
>> 
>> Ankur Chauhan
>> Sent from my iPhone
>> 
>> On May 24, 2017, at 04:05, Josh <jof...@gmail.com <mailto:jof...@gmail.com>> 
>> wrote:
>> 
>>> Hi all,
>>> 
>>> I'm using PubsubIO.Read to consume a Pubsub stream, and my job then writes 
>>> the data out to Bigtable. I'm currently seeing a latency of 3-5 seconds 
>>> between the messages being published and being written to Bigtable.
>>> 
>>> I want to try and decrease the latency to <1s if possible - does anyone 
>>> have any tips for doing this? 
>>> 
>>> I noticed that there is a PubsubGrpcClient 
>>> https://github.com/apache/beam/blob/release-2.0.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubGrpcClient.java
>>>  
>>> <https://github.com/apache/beam/blob/release-2.0.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubGrpcClient.java>
>>>  however the PubsubUnboundedSource is initialised with a PubsubJsonClient, 
>>> so the Grpc client doesn't appear to be being used. Is there a way to 
>>> switch to the Grpc client - as perhaps that would give better performance?
>>> 
>>> Also, I am running my job on Dataflow using autoscaling, which has only 
>>> allocated one n1-standard-4 instance to the job, which is running at ~50% 
>>> CPU. Could forcing a higher number of nodes help improve latency?
>>> 
>>> Thanks for any advice,
>>> Josh
>> 
> 
> 



Re: How to decrease latency when using PubsubIO.Read?

2017-05-24 Thread Ankur Chauhan
What runner address you using. Google cloud dataflow uses a closed source 
version of the pubsub reader as noted in a comment on Read class. 

Ankur Chauhan
Sent from my iPhone

> On May 24, 2017, at 04:05, Josh <jof...@gmail.com> wrote:
> 
> Hi all,
> 
> I'm using PubsubIO.Read to consume a Pubsub stream, and my job then writes 
> the data out to Bigtable. I'm currently seeing a latency of 3-5 seconds 
> between the messages being published and being written to Bigtable.
> 
> I want to try and decrease the latency to <1s if possible - does anyone have 
> any tips for doing this? 
> 
> I noticed that there is a PubsubGrpcClient 
> https://github.com/apache/beam/blob/release-2.0.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubGrpcClient.java
>  however the PubsubUnboundedSource is initialised with a PubsubJsonClient, so 
> the Grpc client doesn't appear to be being used. Is there a way to switch to 
> the Grpc client - as perhaps that would give better performance?
> 
> Also, I am running my job on Dataflow using autoscaling, which has only 
> allocated one n1-standard-4 instance to the job, which is running at ~50% 
> CPU. Could forcing a higher number of nodes help improve latency?
> 
> Thanks for any advice,
> Josh


Re: Reprocessing historic data with streaming jobs

2017-05-01 Thread Ankur Chauhan
I have a somewhat similar use case when dealing with failed / cancelled / broken
streaming pipelines.
We have an operator that continuously monitors the min-watermark of the
pipeline, and when it detects that the watermark has not been advancing for more
than some threshold, we start a new pipeline and initiate a "patcher" batch
dataflow job that reads the event backups over the possibly broken time range
(+/- 1 hour).
It works out well, but it has the overhead of having to build an external
operator process that can detect when to run the batch dataflow job.

Sent from my iPhone

> On May 1, 2017, at 09:37, Thomas Groh  wrote:
> 
> You should also be able to simply add a Bounded Read from the backup data 
> source to your pipeline and flatten it with your Pubsub topic. Because all of 
> the elements produced by both the bounded and unbounded sources will have 
> consistent timestamps, when you run the pipeline the watermark will be held 
> until all of the data is read from the bounded sources. Once this is done, 
> your pipeline can continue processing only elements from the PubSub source. 
> If you don't want the backlog and the current processing to occur in the same 
> pipeline, running the same pipeline but just reading from the archival data 
> should be sufficient (all of the processing would be identical, just the 
> source would need to change).
> 
> If you read from both the "live" and "archival" sources within the same 
> pipeline, you will need to use additional machines so the backlog can be 
> processed promptly if you use a watermark based trigger; watermarks will be 
> held until the bounded source is fully processed.
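
For illustration, a minimal sketch of the flatten approach described above; the
topic, file paths, and the extractEventTimeMillis helper are hypothetical
placeholders, and the archival read needs event timestamps assigned so the
watermark behaves as described.

// Sketch: flatten an archival (bounded) read with the live Pub/Sub stream.
PCollection<String> live = p.apply("ReadLive",
        PubsubIO.readStrings()
                .fromTopic("projects/my-project/topics/events")
                .withTimestampAttribute("ts"));

PCollection<String> archive = p.apply("ReadBackup",
                TextIO.read().from("gs://my-bucket/backups/2017-04-*.json"))
        // Assign event timestamps so the backlog is windowed like live data.
        .apply("AssignEventTimestamps",
                WithTimestamps.of(json -> new Instant(extractEventTimeMillis(json))));

PCollection<String> all = PCollectionList.of(live).and(archive)
        .apply("FlattenLiveAndArchive", Flatten.pCollections());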
> 
>> On Mon, May 1, 2017 at 9:29 AM, Lars BK  wrote:
>> I did not see Lukasz reply before I posted, and I will have to read it a bit 
>> later!
>> 
>>> man. 1. mai 2017 kl. 18.28 skrev Lars BK :
>>> Yes, precisely. 
>>> 
>>> I think that could work, yes. What you are suggesting sounds like idea 2) 
>>> in my original question.
>>> 
>>> My main concern is that I would have to allow a great deal of lateness and 
>>> that old windows would consume too much memory. Whether it works in my case 
>>> or not I don't know yet as I haven't tested it. 
>>> 
>>> What if I had to process even older data? Could I handle any "oldness" of 
>>> data by increasing the allowed lateness and throwing machines at the 
>>> problem to hold all the old windows in memory while the backlog is 
>>> processed? If so, great! But I would have to dial the allowed lateness back 
>>> down when the processing has caught up with the present. 
>>> 
>>> Is there some intended way of handling reprocessing like this? Maybe not? 
>>> Perhaps it is more of a Pubsub and Dataflow question than a Beam question 
>>> when it comes down to it. 
>>> 
>>> 
 man. 1. mai 2017 kl. 17.25 skrev Jean-Baptiste Onofré :
 OK, so the messages are "re-publish" on the topic, with the same timestamp 
 as
 the original and consume again by the pipeline.
 
 Maybe, you can play with the allowed lateness and late firings ?
 
 Something like:
 
Window.into(FixedWindows.of(Duration.minutes(xx)))
.triggering(AfterWatermark.pastEndOfWindow()

 .withEarlyFirings(AfterProcessingTime.pastFirstElementInPane()
.plusDelayOf(FIVE_MINUTES))

 .withLateFirings(AfterProcessingTime.pastFirstElementInPane()
.plusDelayOf(TEN_MINUTES)))
.withAllowedLateness(Duration.minutes()
.accumulatingFiredPanes())
 
 Thoughts ?
 
 Regards
 JB
 
 On 05/01/2017 05:12 PM, Lars BK wrote:
 > Hi Jean-Baptiste,
 >
 > I think the key point in my case is that I have to process or reprocess 
 > "old"
 > messages. That is, messages that are late because they are streamed from 
 > an
 > archive file and are older than the allowed lateness in the pipeline.
 >
 > In the case I described the messages had already been processed once and 
 > no
 > longer in the topic, so they had to be sent and processed again. But it 
 > might as
 > well have been that I had received a backfill of data that absolutely 
 > needs to
 > be processed regardless of it being later than the allowed lateness with 
 > respect
 > to present time.
 >
 > So when I write this now it really sounds like I either need to allow 
 > more
 > lateness or somehow rewind the watermark!
 >
 > Lars
 >
 > man. 1. mai 2017 kl. 16.34 skrev Jean-Baptiste Onofré  >:
 >
 > Hi Lars,
 >
 > interesting use case indeed ;)
 >
 > Just to understand: if possible, you don't want to re-consume the 
 > messages from
 > 

Cloning options instances

2017-04-21 Thread Ankur Chauhan
Hi 

I am attempting to build a sort of supervisor process for Dataflow pipelines
that starts up Dataflow jobs based on some inputs combined with some common
presets. I used to use the cloneAs method on the PipelineOptions class but
noticed that it has been removed, and wanted to ask if there is a way to
"serialize", "deserialize", and clone existing options instances for easy
reusability.
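
For illustration, one possible substitute: a minimal sketch that relies on
PipelineOptions' built-in Jackson serialization hooks, assuming every option
value set on the instance is JSON-serializable. The class and method names are
illustrative.

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.beam.sdk.options.PipelineOptions;

public final class OptionsCloner {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Sketch: "clone" options by round-tripping through JSON, then view the
    // copy as the desired options interface via as().
    public static <T extends PipelineOptions> T cloneAs(PipelineOptions options, Class<T> kind)
            throws java.io.IOException {
        String json = MAPPER.writeValueAsString(options);
        return MAPPER.readValue(json, PipelineOptions.class).as(kind);
    }
}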

-- AC

Sent from my iPhone

Re: Renaming SideOutput

2017-04-12 Thread Ankur Chauhan
This question may be obvious to others, but why is there a distinction between
the main output and additional outputs? Why not just have a simple list of
outputs where the first one is the main one?
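
For context, a small sketch of how the two kinds of outputs look in the Java
SDK using the proposed naming; the tags, element types, and the input
PCollection<String> are illustrative.

// Sketch: a ParDo with a main output and one additional (tagged) output.
final TupleTag<String> mainTag = new TupleTag<String>() {};
final TupleTag<String> errorTag = new TupleTag<String>() {};

PCollectionTuple results = input.apply("SplitGoodAndBad",
        ParDo.of(new DoFn<String, String>() {
            @ProcessElement
            public void process(ProcessContext c) {
                if (c.element().startsWith("ERROR")) {
                    c.output(errorTag, c.element());  // additional tagged output
                } else {
                    c.output(c.element());            // main output
                }
            }
        }).withOutputTags(mainTag, TupleTagList.of(errorTag)));

PCollection<String> good = results.get(mainTag);
PCollection<String> bad = results.get(errorTag);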

-- AC 

Sent from my iPhone

> On Apr 12, 2017, at 18:08, Melissa Pashniak  
> wrote:
> 
> I agree, I'll create a PR with the doc changes (the rename + text changes
> to make things more clear). I know of at least 2 places we refer to side
> outputs (programming guide and the "Design your pipeline" page).
> 
> 
> On Tue, Apr 11, 2017 at 5:34 PM, Thomas Groh 
> wrote:
> 
>> I think that's a good idea. I would call the outputs of a ParDo the "Main
>> Output" and "Additional Outputs" - it seems like an easy way to make it
>> clear that there's one output that is always expected, and there may be
>> more.
>> 
>> On Tue, Apr 11, 2017 at 5:29 PM, Robert Bradshaw <
>> rober...@google.com.invalid> wrote:
>> 
>>> We should do some renaming in Python too. Right now we have
>>> SideOutputValue which I'd propose naming TaggedOutput or something
>>> like that.
>>> 
>>> Should the docs change too?
>>> https://beam.apache.org/documentation/programming-
>> guide/#transforms-sideio
>>> 
>>> On Tue, Apr 11, 2017 at 5:25 PM, Kenneth Knowles >> 
>>> wrote:
 +1 ditto about sideInput and sideOutput not actually being related
 
 On Tue, Apr 11, 2017 at 3:52 PM, Robert Bradshaw <
 rober...@google.com.invalid> wrote:
 
> +1, I think this is a lot clearer.
> 
> On Tue, Apr 11, 2017 at 2:24 PM, Stephen Sisk >> 
> wrote:
>> strong +1 for changing the name away from sideOutput - the fact that
>> sideInput and sideOutput are not really related was definitely a
>>> source
> of
>> confusion for me when learning beam.
>> 
>> S
>> 
>> On Tue, Apr 11, 2017 at 1:56 PM Thomas Groh
>> > wrote:
>> 
>>> Hey everyone:
>>> 
>>> I'd like to rename DoFn.Context#sideOutput to #output (in the Java
>>> SDK).
>>> 
>>> Having two methods, both named output, one which takes the "main
>>> output
>>> type" and one that takes a tag to specify the type more clearly
>>> communicates the actual behavior - sideOutput isn't a "special" way
>>> to
>>> output, it's the same as output(T), just to a specified
>> PCollection.
> This
>>> will help pipeline authors understand the actual behavior of
>>> outputting
> to
>>> a tag, and detangle it from "sideInput", which is a special way to
> receive
>>> input. Giving them the same name means that it's not even strange
>> to
> call
>>> output and provide the main output type, which is what we want -
>>> it's a
>>> more specific way to output, but does not have different
>>> restrictions or
>>> capabilities.
>>> 
>>> This is also a pretty small change within the SDK - it touches
>> about
>>> 20
>>> files, and the changes are pretty automatic.
>>> 
>>> Thanks,
>>> 
>>> Thomas
>>> 
> 
>>> 
>> 


Re: Emitting results saved using state api without input (cleanup)

2017-04-12 Thread Ankur Chauhan
Hi Kenneth,

Thanks for the pointer. I wrote some tests that exercise the same code using
PAssert by sending two events that get buffered by the code mentioned
originally. I noticed that the @OnTimer(“…”) method never got invoked. Is that
a known issue with the TestPipeline?
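
One thing worth checking: event-time timers only fire once the watermark passes
the timer's timestamp, and TestStream (supported by the direct runner) makes
watermark advancement explicit in a test. A minimal sketch, with illustrative
element values and coders and a hypothetical BufferingFn standing in for the
real DoFn:

// Sketch: drive the stateful/timer DoFn with TestStream so watermark
// advancement (and therefore event-time timer firing) is under test control.
TestStream<KV<String, String>> events =
        TestStream.create(KvCoder.of(StringUtf8Coder.of(), StringUtf8Coder.of()))
                .addElements(KV.of("session-1", "eventA"))
                .addElements(KV.of("session-1", "eventB"))
                .advanceWatermarkToInfinity();  // lets pending event-time timers fire

PCollection<KV<String, String>> output =
        pipeline.apply(events).apply(ParDo.of(new BufferingFn()));  // hypothetical DoFn

// Expected output is illustrative.
PAssert.that(output).containsInAnyOrder(KV.of("session-1/typeA", "eventA"));

pipeline.run().waitUntilFinish();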

— AC

> On Apr 11, 2017, at 10:28 AM, Ankur Chauhan <an...@malloc64.com> wrote:
> 
> Thanks for confirming a hunch that I had. I was considering doing that but 
> the javadoc saying "this feature is not implemented by any runner" sort of 
> put me off. 
> 
> Is there a more up to date list of similar in progress features? If not it 
> may be helpful to keep one. 
> 
> Thanks!
> Ankur Chauhan. 
> 
> Sent from my iPhone
> 
> On Apr 11, 2017, at 07:18, Kenneth Knowles <k...@google.com 
> <mailto:k...@google.com>> wrote:
> 
>> Hi Ankur,
>> 
>> If I understand your desire, then what you need for such a use case is an 
>> event time timer, to flush when you are ready. You might choose the end of 
>> the window, the window GC time or, in the global window, just some later 
>> watermark.
>> 
>> new DoFn<...>() {
>> 
>>   @StateId("buffer")
>>   private final StateSpec<Object, BagState> bufferSpec = 
>> StateSpec.bag(...)
>> 
>>   @TimerId("finallyCleanup")
>>   private final TimerSpec finallySpec = 
>> TimerSpecs.timer(TimeDomain.EVENT_TIME);
>> 
>>   @ProcessElement
>>   public void process(@TimerId("finallyCleanup") Timer cleanupTimer) {
>> cleanupTimer.set(...);
>>   }
>> 
>>   @OnTimer("finallyCleanup")
>>   public void onFinallyCleanup(@StateId("buffer") BagState 
>> buffered) {
>> ...
>>   }
>> }
>> 
>> This feature hasn't been blogged about or documented thoroughly except for a 
>> couple of examples in the DoFn Javadoc. But it is available since 0.6.0.
>> 
>> Kenn
>> 
>> On Tue, Apr 11, 2017 at 3:03 AM, Ankur Chauhan <an...@malloc64.com 
>> <mailto:an...@malloc64.com>> wrote:
>> Hi,
>> 
>> I am attempting to do a seemingly simple task using the new state api. I 
>> have created a DoFn<KV<Sting, Event>, KV<String, Event>> that accepts events 
>> keyed by a particular id (session id) and intends to emit the same events 
>> partitioned by as sessionID/eventType. In the simple case this would be a 
>> normal DoFn but there is always a case where some events are not as clean as 
>> we would like and we need to save some state for the session and then emit 
>> those events later when cleanup is complete. For example:
>> 
>> Let’s say that the first few events are missing the eventType (or any other 
>> field), so we would like to buffer those events till we get the first event 
>> with the eventType field set and then use this information to emit the 
>> contents of the buffer with (last observed eventType + original contents of 
>> the buffered events),
>> 
>> For this my initial approach involved creating a BagState which would 
>> contain any buffered events and as more events came in, i would either emit 
>> the input with modification, or add the input to the buffer or, emit the 
>> events in the buffer with the input.
>> 
>> While running my test, I found that if I never get a “good” input, i.e. the 
>> session is only filled with error inputs, I would keep on buffering the 
>> input and never emit anything. My question is, how do i emit this buffer 
>> event when there is no more input?
>> 
>> — Ankur Chauhan
>> 



Re: Renaming SideOutput

2017-04-11 Thread Ankur Chauhan
+1. This is pretty much the topmost thing that I found odd when starting with
the Beam model. It would definitely be more intuitive to have a consistent
name.

Sent from my iPhone

> On Apr 11, 2017, at 18:29, Aljoscha Krettek  wrote:
> 
> +1
> 
>> On Wed, Apr 12, 2017, at 02:34, Thomas Groh wrote:
>> I think that's a good idea. I would call the outputs of a ParDo the "Main
>> Output" and "Additional Outputs" - it seems like an easy way to make it
>> clear that there's one output that is always expected, and there may be
>> more.
>> 
>> On Tue, Apr 11, 2017 at 5:29 PM, Robert Bradshaw <
>> rober...@google.com.invalid> wrote:
>> 
>>> We should do some renaming in Python too. Right now we have
>>> SideOutputValue which I'd propose naming TaggedOutput or something
>>> like that.
>>> 
>>> Should the docs change too?
>>> https://beam.apache.org/documentation/programming-guide/#transforms-sideio
>>> 
>>> On Tue, Apr 11, 2017 at 5:25 PM, Kenneth Knowles 
>>> wrote:
 +1 ditto about sideInput and sideOutput not actually being related
 
 On Tue, Apr 11, 2017 at 3:52 PM, Robert Bradshaw <
 rober...@google.com.invalid> wrote:
 
> +1, I think this is a lot clearer.
> 
> On Tue, Apr 11, 2017 at 2:24 PM, Stephen Sisk 
> wrote:
>> strong +1 for changing the name away from sideOutput - the fact that
>> sideInput and sideOutput are not really related was definitely a
>>> source
> of
>> confusion for me when learning beam.
>> 
>> S
>> 
>> On Tue, Apr 11, 2017 at 1:56 PM Thomas Groh > wrote:
>> 
>>> Hey everyone:
>>> 
>>> I'd like to rename DoFn.Context#sideOutput to #output (in the Java
>>> SDK).
>>> 
>>> Having two methods, both named output, one which takes the "main
>>> output
>>> type" and one that takes a tag to specify the type more clearly
>>> communicates the actual behavior - sideOutput isn't a "special" way
>>> to
>>> output, it's the same as output(T), just to a specified PCollection.
> This
>>> will help pipeline authors understand the actual behavior of
>>> outputting
> to
>>> a tag, and detangle it from "sideInput", which is a special way to
> receive
>>> input. Giving them the same name means that it's not even strange to
> call
>>> output and provide the main output type, which is what we want -
>>> it's a
>>> more specific way to output, but does not have different
>>> restrictions or
>>> capabilities.
>>> 
>>> This is also a pretty small change within the SDK - it touches about
>>> 20
>>> files, and the changes are pretty automatic.
>>> 
>>> Thanks,
>>> 
>>> Thomas
>>> 
> 
>>> 


Re: Emitting results saved using state api without input (cleanup)

2017-04-11 Thread Ankur Chauhan
Thanks for confirming a hunch that I had. I was considering doing that but the 
javadoc saying "this feature is not implemented by any runner" sort of put me 
off. 

Is there a more up to date list of similar in progress features? If not it may 
be helpful to keep one. 

Thanks!
Ankur Chauhan. 

Sent from my iPhone

> On Apr 11, 2017, at 07:18, Kenneth Knowles <k...@google.com> wrote:
> 
> Hi Ankur,
> 
> If I understand your desire, then what you need for such a use case is an 
> event time timer, to flush when you are ready. You might choose the end of 
> the window, the window GC time or, in the global window, just some later 
> watermark.
> 
> new DoFn<...>() {
> 
>   @StateId("buffer")
>   private final StateSpec<Object, BagState> bufferSpec = 
> StateSpec.bag(...)
> 
>   @TimerId("finallyCleanup")
>   private final TimerSpec finallySpec = 
> TimerSpecs.timer(TimeDomain.EVENT_TIME);
> 
>   @ProcessElement
>   public void process(@TimerId("finallyCleanup") Timer cleanupTimer) {
> cleanupTimer.set(...);
>   }
> 
>   @OnTimer("finallyCleanup")
>   public void onFinallyCleanup(@StateId("buffer") BagState buffered) 
> {
> ...
>   }
> }
> 
> This feature hasn't been blogged about or documented thoroughly except for a 
> couple of examples in the DoFn Javadoc. But it is available since 0.6.0.
> 
> Kenn
> 
>> On Tue, Apr 11, 2017 at 3:03 AM, Ankur Chauhan <an...@malloc64.com> wrote:
>> Hi,
>> 
>> I am attempting to do a seemingly simple task using the new state api. I 
>> have created a DoFn<KV<Sting, Event>, KV<String, Event>> that accepts events 
>> keyed by a particular id (session id) and intends to emit the same events 
>> partitioned by as sessionID/eventType. In the simple case this would be a 
>> normal DoFn but there is always a case where some events are not as clean as 
>> we would like and we need to save some state for the session and then emit 
>> those events later when cleanup is complete. For example:
>> 
>> Let’s say that the first few events are missing the eventType (or any other 
>> field), so we would like to buffer those events till we get the first event 
>> with the eventType field set and then use this information to emit the 
>> contents of the buffer with (last observed eventType + original contents of 
>> the buffered events),
>> 
>> For this my initial approach involved creating a BagState which would 
>> contain any buffered events and as more events came in, i would either emit 
>> the input with modification, or add the input to the buffer or, emit the 
>> events in the buffer with the input.
>> 
>> While running my test, I found that if I never get a “good” input, i.e. the 
>> session is only filled with error inputs, I would keep on buffering the 
>> input and never emit anything. My question is, how do i emit this buffer 
>> event when there is no more input?
>> 
>> — Ankur Chauhan
> 


Emitting results saved using state api without input (cleanup)

2017-04-11 Thread Ankur Chauhan
Hi,

I am attempting to do a seemingly simple task using the new state API. I have
created a DoFn<KV<String, Event>, KV<String, Event>> that accepts events keyed
by a particular id (session id) and intends to emit the same events partitioned
as sessionID/eventType. In the simple case this would be a normal DoFn, but
there is always a case where some events are not as clean as we would like and
we need to save some state for the session and then emit those events later
when cleanup is complete. For example:

Let’s say that the first few events are missing the eventType (or any other 
field), so we would like to buffer those events till we get the first event 
with the eventType field set and then use this information to emit the contents 
of the buffer with (last observed eventType + original contents of the buffered 
events),

For this, my initial approach involved creating a BagState<Event> which would
contain any buffered events; as more events came in, I would either emit the
input with modification, add the input to the buffer, or emit the events in
the buffer along with the input.

While running my test, I found that if I never get a “good” input, i.e. the
session is only filled with error inputs, I would keep on buffering the input
and never emit anything. My question is, how do I emit this buffered content
when there is no more input?

— Ankur Chauhan

[grpc-io] [grpc-java] Lots of Stream Error messages while doing Server-side streaming RPC

2016-12-22 Thread Ankur Chauhan

I am building a gRPC server that queries Google Cloud Bigtable based on
a user request and delivers a stream of de-serialized (protobuf) rows to
the user.

I have noticed that there are a lot of "Stream Error" messages in logs:

"Stream Error
io.netty.handler.codec.http2.Http2Exception$StreamException: Stream closed 
before write could take place
at 
io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:147)
at 
io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$FlowState.cancel(DefaultHttp2RemoteFlowController.java:487)
at 
io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$FlowState.cancel(DefaultHttp2RemoteFlowController.java:468)
at 
io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$1.onStreamClosed(DefaultHttp2RemoteFlowController.java:103)
at 
io.netty.handler.codec.http2.DefaultHttp2Connection.notifyClosed(DefaultHttp2Connection.java:343)
at 
io.netty.handler.codec.http2.DefaultHttp2Connection$ActiveStreams.removeFromActiveStreams(DefaultHttp2Connection.java:1168)
at 
io.netty.handler.codec.http2.DefaultHttp2Connection$ActiveStreams.deactivate(DefaultHttp2Connection.java:1116)
at 
io.netty.handler.codec.http2.DefaultHttp2Connection$DefaultStream.close(DefaultHttp2Connection.java:522)
at 
io.netty.handler.codec.http2.DefaultHttp2Connection.close(DefaultHttp2Connection.java:149)
at 
io.netty.handler.codec.http2.Http2ConnectionHandler$BaseDecoder.channelInactive(Http2ConnectionHandler.java:181)
at 
io.netty.handler.codec.http2.Http2ConnectionHandler.channelInactive(Http2ConnectionHandler.java:374)
at 
io.grpc.netty.NettyServerHandler.channelInactive(NettyServerHandler.java:274)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:256)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:242)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:235)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:360)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:325)
at 
io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:726)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:256)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:242)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:235)
at 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1329)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:256)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:242)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:908)
at 
io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:744)
at 
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at 
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
at 
io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:312)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873)
at 
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:745)


My method is pretty basic and the following snippet captures the essence of 
the service call.

final Stream rowStream = streamFromBigTable(request);
final ServerCallStreamObserver responseObserver =
        (ServerCallStreamObserver) _responseObserver;
StreamObservers.copyWithFlowControl(rowStream.iterator(), responseObserver);

Can someone elaborate on the origin of these error messages? They seem bad,
but I can't seem to find how to control them.
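
For illustration, these particular stream errors commonly appear when the
client cancels or the connection closes while responses are still queued for
writing. A minimal sketch of making that case explicit on the server side; the
element type Row, streamFromBigTable, and _responseObserver follow the snippet
above, and LOG is illustrative.

// Sketch: register a cancellation handler before streaming, so a client that
// goes away mid-stream is observed explicitly and the scan can be stopped early.
final ServerCallStreamObserver<Row> responseObserver =
        (ServerCallStreamObserver<Row>) _responseObserver;
responseObserver.setOnCancelHandler(
        () -> LOG.info("client cancelled the call; stopping the row stream"));

final Stream<Row> rowStream = streamFromBigTable(request);
StreamObservers.copyWithFlowControl(rowStream.iterator(), responseObserver);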

-- Ankur Chauhan



Re: [grpc-io] Re: grpc + appengine? Is this possible?

2016-08-02 Thread Ankur Chauhan
Do you have a link to the tech talk? I would be very interested in
something like that.

On Mon, Aug 1, 2016 at 9:53 PM Jeff Kusi <jeffk...@gmail.com> wrote:

> Haha Nodir,
>
> So funny enough, I watched your tech talk after asking the question on
> here.
> I'm definitely going to use it :)
>
> 2016-08-01 13:13 GMT-07:00 Nodir Turakulov <no...@google.com>:
>
>> yeah, looks similar
>> there is also UI for making RPCs with docs-from-comments and request
>> autocompletion
>> <https://luci-logdog.appspot.com/rpcexplorer/services/logdog.Logs/Query?request=%7B%20%20%20%20%22project%22:%20%22chromium%22,%20%20%20%20%22proto%22:%20true,%20%20%20%20%22streamType%22:%20%7B%20%20%20%20%20%20%20%20%22value%22:%20%22TEXT%22%20%20%20%20%7D%7D>
>>  and
>> a discovery service
>>
>> On Mon, Aug 1, 2016 at 1:01 PM Ernesto Alejo <ernestoka...@gmail.com>
>> wrote:
>>
>>> It is similar to https://github.com/grpc-ecosystem/grpc-gateway isn't
>>> it? I can see you have a CLI tool that could be great for testing and some
>>> helpers for App Engine though.
>>>
>>> On Monday, August 1, 2016 at 9:46:53 PM UTC+2, no...@google.com wrote:
>>>>
>>>> (shameless plug)
>>>> You may find pRPC useful: basically allows you to run a gRPC service on
>>>> appengine via HTTP 1. Unlike grpc, does not support streaming, supports
>>>> making RPCs from the browser.
>>>>
>>>> https://godoc.org/github.com/luci/luci-go/grpc/prpc
>>>>
>>>> On Saturday, July 30, 2016 at 1:44:45 AM UTC-7, jeff...@gmail.com
>>>> wrote:
>>>>>
>>>>> hi,
>>>>>
>>>>> Any updates on golang grpc support on app-engine?
>>>>> Are we looking at months or a year?
>>>>>
>>>>> On Wednesday, March 30, 2016 at 3:46:26 AM UTC-7, Miguel Vitorino
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Found this interim solution from a Google developer
>>>>>> http://nodir.io/post/138899670556/prpc.
>>>>>> Haven't tested it though.. has anyone?
>>>>>>
>>>>>>
>>>>>> On Wednesday, January 20, 2016 at 7:09:25 PM UTC, Ankur Chauhan wrote:
>>>>>>>
>>>>>>> Any updates on this?
>>>>>>>
>>>>>>> On Saturday, August 15, 2015 at 11:21:29 AM UTC-7, Alexander
>>>>>>> Yevsyukov wrote:
>>>>>>>>
>>>>>>>> We are interested in Java support.  Is there a chance for early
>>>>>>>> adopters preview?
>>>>>>>>
>>>>>>>> We've got an app under AppEngine, with Web and Android clients, and
>>>>>>>> iOS client development recently started. Protobuf and gRPC will help 
>>>>>>>> us a
>>>>>>>> lot.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Monday, May 4, 2015 at 8:28:50 PM UTC+2, Louis Ryan wrote:
>>>>>>>>>
>>>>>>>>> +cc Ludo who's team is working on GRPC for AppEngine v2.
>>>>>>>>>
>>>>>>>>> Which language are you interested in doing this with? For Java, I
>>>>>>>>> doubt plugging the AE socket API below one of the GRPC transport
>>>>>>>>> implementation (Netty / OkHttp) would be all that difficult but its 
>>>>>>>>> not
>>>>>>>>> something I've looked at.
>>>>>>>>>
>>>>>>>>> On Mon, May 4, 2015 at 5:29 AM, <eyal...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Any updates on that?
>>>>>>>>>>
>>>>>>>>>> Assuming not I'm thinking of a mid-term workaround by using
>>>>>>>>>> appengine sockets and building the request directly.
>>>>>>>>>>
>>>>>>>>>> How hard will it be to write a function which is supposed to send
>>>>>>>>>> a specific RPC request, That is, a function which gets a specific 
>>>>>>>>>> proto,
>>>>>>>>>> open a socket to a GCE hosted grpc server send the request (I assume 
>>>>>>>>>> the
>>>>>>>>>> request should have the serialize

[jira] [Commented] (SPARK-6707) Mesos Scheduler should allow the user to specify constraints based on slave attributes

2016-06-12 Thread Ankur Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326822#comment-15326822
 ] 

Ankur Chauhan commented on SPARK-6707:
--

Yes, that is exactly what I wanted to express.



> Mesos Scheduler should allow the user to specify constraints based on slave 
> attributes
> --
>
> Key: SPARK-6707
> URL: https://issues.apache.org/jira/browse/SPARK-6707
> Project: Spark
>  Issue Type: Improvement
>  Components: Mesos, Scheduler
>Affects Versions: 1.3.0
>    Reporter: Ankur Chauhan
>Assignee: Ankur Chauhan
>  Labels: mesos, scheduler
> Fix For: 1.5.0
>
>
> Currently, the mesos scheduler only looks at the 'cpu' and 'mem' resources 
> when trying to determine the usability of a resource offer from a mesos 
> slave node. It may be preferable for the user to be able to ensure that the 
> spark jobs are only started on a certain set of nodes (based on attributes). 
> For example, If the user sets a property, let's say 
> {code}spark.mesos.constraints{code} is set to 
> {code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
> checked to see if they meet both these constraints and only then will be 
> accepted to start new executors.
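
A rough sketch of what such constraint handling could look like (this is not the patch that actually went into Spark; the parsing below only handles Mesos TEXT attributes, and the Offer/Attribute accessors are from the Mesos protobuf API):

import org.apache.mesos.Protos.Offer
import scala.collection.JavaConverters._

// "tachyon=true;us-east-1=false" -> Map("tachyon" -> "true", "us-east-1" -> "false")
def parseConstraints(spec: String): Map[String, String] =
  spec.split(";").filter(_.nonEmpty).map { kv =>
    val Array(k, v) = kv.split("=", 2)
    k.trim -> v.trim
  }.toMap

// An offer is usable only if every constraint matches one of its TEXT attributes.
// Scalar/range/set attributes would need their own comparison logic.
def offerSatisfies(offer: Offer, constraints: Map[String, String]): Boolean = {
  val attrs = offer.getAttributesList.asScala
    .filter(_.hasText)
    .map(a => a.getName -> a.getText.getValue)
    .toMap
  constraints.forall { case (name, wanted) => attrs.get(name).exists(_ == wanted) }
}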



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: Mesos/Marathon/HAProxy Logging

2015-08-25 Thread Ankur Chauhan
This may help: 

http://serverfault.com/questions/331079/haproxy-and-forwarding-client-ip-address-to-servers

We use similar options to ensure we have the remote ip.
 On 25 Aug 2015, at 09:30, John Omernik j...@omernik.com wrote:
 
 I have been playing with an application that is a very simple app: A 
 webservice running in Python. I've created a docker container, it runs in the 
 container, I setup marathon to run it, I use mesos-dns and ha proxy and I can 
 access the service just fine anywhere in the cluster. 
 
 First let me say this is VERY cool. The capabilities here awesome.
 
 Now the challenge: the security guy in me wants to take good logs from my 
 app.  It was set up to do its own logging through a custom module. I am very 
 happy with it.  I setup the app in the container to mount a volume that's in 
 my MapRFS via NFS so I can log directly to a clustered filesystem. THis is 
 awesome, I can read my logs in Apache Drill as they are written!!!
 
 However, the haproxy threw me for a loop. Once I started running the app in 
 Marathon with a service port and routed around via haproxy, I realized 
 something:  I lost my source IPs in my logs? 
 
 Why?
 
 Because once HAProxy takes over, it no longer needs to keep the source IP, 
 and instead the next hop only sees the previous connection IP.  From a 
 service discovery perspective it works great, but with this setup, I'd lose 
 the previous hop. Perhaps I manually add something in haproxy to add an 
 X-forwarded-for header, that would be nice, however, that only works for http 
 apps, what about other TCP apps that are not HTTP? 
 
 This is an interesting problem, because apps should have good logging, 
 security, performance, troubleshooting, and if I can't get the source IP it 
 could be a problem. 
 
 So, my question is this, anyone ran into this? How are you handling it?  Any 
 brainstorms here we may be able to work off of? 
 
 One thing I thought was why are we using HAproxy? Couldn't the same HAProxy 
 script, actually put in forwarding rules in IPtables?  This sounds messy, but 
 could it work? Has anyone explored that? If the data was forwarded, than it 
 wouldn't lose the IP information (and timeouts wouldn't be a concern either 
 (I think I posted before on how long running TCP connections can be closed 
 down by HAProxy if they don't implement TCP Keep alives). 
 
 Other ideas?  This is interesting to me, and likely others. 



Re: MesosCon Seattle attendee introduction thread

2015-08-17 Thread Ankur Chauhan
Hi all,

I am Ankur Chauhan. I am a Sr. Software engineer with the Reporting and 
Analytics team
at Brightcove Inc. I have been evaluating, tinkering with, and developing on Mesos for 
about a year
now. My latest adventure has been the Spark-Mesos integration and writing 
the new Apache Flink-Mesos integration.

I am interested in learning about managing stateful services in mesos and 
creating better documentation
for the project.

I am very excited to meet everyone!

-- Ankur Chauhan.

 On 17 Aug 2015, at 00:10, Trevor Powell trevor.pow...@rms.com wrote:
 
 Hey Mesos Family! Can't wait to see you all in person.
 
 I'm Trevor Powell. I am the Product Owner for our TechOps engineering team
 at RMS. RMS is in the catastrophic modeling business. Think of it as
 modeling Acts of God (earthquakes, floods, Godzilla, etc) on physical
 property and damages associated with them.
 
 We've been evaluating Mesos this year, and we are planning to launch it in
 PRD at the start of next. I am super excited :-)
 
 I am very interested in managing stateful applications inside Mesos. Also
 network segmentation in Mesos (see my "Mesos, Multinode Workload Network
 segregation" email thread earlier this month).
 
 See you all Thursday!!
 
 Stay Smooth,
 
 --
 
 Trevor Alexander Powell
 Sr. Manager, Cloud Engineer & Architecture
 7575 Gateway Blvd. Newark, CA 94560
 T: +1.510.713.3751
 M: +1.650.325.7467
 www.rms.com
 https://www.linkedin.com/in/trevorapowell
 
 https://github.com/tpowell-rms
 
 
 
 
 
 
 On 8/16/15, 1:58 PM, Dave Lester d...@davelester.org wrote:
 
 Hi All,
 
 I'd like to kick off a thread for folks to introduce themselves in
 advance of #MesosCon
 http://events.linuxfoundation.org/events/mesoscon. Here goes:
 
 My name is Dave Lester, and I'm an Open Source Advocate at Twitter. I am
 a member of the MesosCon program committee, along with a stellar group
 of other community members who have volunteered
 http://events.linuxfoundation.org/events/mesoscon/program/programcommitte
 e.
 Can't wait to meet as many of you as possible.
 
 I'm eager to meet with folks interested in learning more about how we
 deploy and manage services at Twitter using Mesos and Apache Aurora
 http://aurora.apache.org. Twitter has a booth where I'll be hanging
 out for a portion of the conference, feel free to stop by and say hi.
 I'm also interested in connecting with companies that use Mesos; let's
 make sure we add you to our #PoweredByMesos list
 http://mesos.apache.org/documentation/latest/powered-by-mesos/.
 
 I'm also on Twitter: @davelester
 
 Next!
 





Spark streaming and session windows

2015-08-07 Thread Ankur Chauhan
Hi all,

I am trying to figure out how to perform the equivalent of session windows (as 
mentioned in https://cloud.google.com/dataflow/model/windowing) using Spark 
Streaming. Is it even possible (i.e. possible to do efficiently at scale)? Just 
to expand on the definition:

Taken from the google dataflow documentation:

The simplest kind of session windowing specifies a minimum gap duration. All 
data arriving below a minimum threshold of time delay is grouped into the same 
window. If data arrives after the minimum specified gap duration time, this 
initiates the start of a new window.




Any help would be appreciated.

-- Ankur Chauhan
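
For anyone looking for a concrete starting point: below is a minimal sketch of gap-based sessionization built on updateStateByKey (which requires checkpointing to be enabled). It is not a full equivalent of Dataflow's session windows -- event time, late data and watermarks are ignored -- and the key type, timestamps and 30s gap are only illustrative.

import org.apache.spark.streaming.dstream.DStream

// Per-key session state: when it started, when we last saw an event, and the event count.
case class Session(start: Long, lastSeen: Long, count: Long)

val gapMs = 30 * 1000L  // close a session after 30s of inactivity (illustrative)

// events: (userId, eventTimeMillis); the StreamingContext must have ssc.checkpoint(...) set.
def sessionize(events: DStream[(String, Long)]): DStream[(String, Session)] =
  events.updateStateByKey[Session] { (newTs: Seq[Long], state: Option[Session]) =>
    val now = System.currentTimeMillis()
    (newTs.sorted, state) match {
      // new events that continue an open session (gap not exceeded)
      case (ts, Some(s)) if ts.nonEmpty && ts.head - s.lastSeen <= gapMs =>
        Some(s.copy(lastSeen = ts.last, count = s.count + ts.size))
      // new events after the gap (or no prior session): start a fresh session
      case (ts, _) if ts.nonEmpty =>
        Some(Session(start = ts.head, lastSeen = ts.last, count = ts.size))
      // idle key whose gap has expired: drop the state, the session is finished
      case (_, Some(s)) if now - s.lastSeen > gapMs => None
      // idle key with a still-open session: keep it as is
      case (_, s) => s
    }
  }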





Re: FYI: Blog Post on Flink's Streaming Performance and Fault Tolerance

2015-08-05 Thread Ankur Chauhan
Pretty awesome piece. 

Sent from my iPhone

 On Aug 5, 2015, at 10:10, Hawin Jiang hawin.ji...@gmail.com wrote:
 
 Great job, Guys
 
 Let me read it carefully. 
 
 
 
 
 
 
 
 On Wed, Aug 5, 2015 at 7:25 AM, Stephan Ewen se...@apache.org wrote:
 I forgot the link ;-)
 
 http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
 
 On Wed, Aug 5, 2015 at 4:11 PM, Stephan Ewen se...@apache.org wrote:
 Hi all!
 
 We just published a blog post about how streaming fault tolerance 
 mechanisms evolved, and what kind of performance Flink gets with its 
 checkpointing mechanism.
 
 I think it is a pretty interesting read for people that are interested in 
 Flink or data streaming in general.
 
 The blog post talks about:
 
   - Fault tolerance techniques, starting from acknowledgements, over micro 
 batches, to transactional updates and distributed snapshots.
 
   - Performance of Flink, throughput, latency, and tradeoffs.
 
   - A chaos monkey experiment where computation continues strongly 
 consistent even when periodically killing workers.
 
 
 Comments welcome!
 
 Greetings,
 Stephan
 



[jira] [Commented] (FLINK-1984) Integrate Flink with Apache Mesos

2015-07-28 Thread Ankur Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645190#comment-14645190
 ] 

Ankur Chauhan commented on FLINK-1984:
--

I just opened a pull request, https://github.com/apache/flink/pull/948, that is a 
first step toward handling the Flink-Mesos integration. This being my first framework and my first 
venture into Apache Flink, I would appreciate some early feedback (just in case 
I am going down the wrong path).

 Integrate Flink with Apache Mesos
 -

 Key: FLINK-1984
 URL: https://issues.apache.org/jira/browse/FLINK-1984
 Project: Flink
  Issue Type: New Feature
  Components: New Components
Reporter: Robert Metzger
Priority: Minor
 Attachments: 251.patch


 There are some users asking for an integration of Flink into Mesos.
 There also is a pending pull request for adding Mesos support for Flink: 
 https://github.com/apache/flink/pull/251
 But the PR is insufficiently tested. I'll add the code of the pull request to 
 this JIRA in case somebody wants to pick it up in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Questions about framework development - (HA and reconciling state)

2015-07-25 Thread Ankur Chauhan
Hi all,


I am working on creating an integration between Apache Flink 
(http://flink.apache.org) and mesos which would be similar to the way the 
current hadoop-mesos integration works using the java mesos client.
My current idea is that the scheduler will also run a JobManager process 
(similar to the jobTracker) which will start off a bunch of taskManager 
(similar to the TaskTracker) tasks using a custom executor.

I want to get some feedback and information on the following questions:

0. How do I go about the issue of HA at the scheduler level?
I was thinking of using ZooKeeper-based leader election by directly 
maintaining a ZooKeeper connection myself (see the sketch after this message). Is there a better way to do this 
(something which does not require me to use a self-managed ZooKeeper 
connection)?

1. How do I deal with restarts and reconciling the tasks?
In case the scheduler restarts (it currently maintains an in-memory map of 
running tasks), how do I go about rediscovering tasks and reconciling 
state?
I was thinking of using DiscoverInfo, but I can't find any reference that explains 
how to query Mesos for tasks matching the service discovery 
information. Any suggestions on how to do this?

3. How does one go about testing frameworks? Any suggestions / pointers.

My work in progress version is at 
https://github.com/ankurcha/flink/tree/flink-mesos/flink-mesos

Any help would be much appreciated.


Thanks!
Ankur
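
On question 0, a minimal sketch of scheduler-side leader election using Apache Curator's LeaderLatch recipe (choosing Curator is an assumption here, any ZooKeeper client works; the ensemble address and znode path are made up):

import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.framework.recipes.leader.LeaderLatch
import org.apache.curator.retry.ExponentialBackoffRetry

// Connect to ZooKeeper and block until this scheduler instance wins the election.
val client = CuratorFrameworkFactory.newClient(
  "zk1:2181,zk2:2181,zk3:2181",                 // hypothetical ensemble
  new ExponentialBackoffRetry(1000, 3))
client.start()

val latch = new LeaderLatch(client, "/flink-mesos/scheduler-leader")
latch.start()
latch.await()                                    // returns once we are the leader

try {
  // ... register with the Mesos master and run the scheduler/JobManager here ...
} finally {
  latch.close()                                  // give up leadership on shutdown
  client.close()
}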




Re: use S3-Compatible Storage with spark

2015-07-17 Thread Ankur Chauhan
The endpoint is the property you want to set. I would look at the source for 
that.

Sent from my iPhone
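
A minimal sketch of pointing Spark at a non-AWS, S3-compatible store via the Hadoop s3a connector (assumes hadoop-aws and its dependencies are on the classpath; the endpoint host and environment variable names are made up):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("s3-compatible-example"))
val hc = sc.hadoopConfiguration

hc.set("fs.s3a.endpoint", "storage.example.internal:9000")  // your S3-compatible host
hc.set("fs.s3a.access.key", sys.env("S3_ACCESS_KEY"))
hc.set("fs.s3a.secret.key", sys.env("S3_SECRET_KEY"))
hc.setBoolean("fs.s3a.connection.ssl.enabled", false)       // if the store speaks plain HTTP

// Read with the s3a:// scheme instead of s3n://
val lines = sc.textFile("s3a://my-bucket/some/prefix/*.txt")
println(lines.count())

For the older s3n (jets3t) connector mentioned in this thread, the endpoint is configured through a jets3t.properties file rather than a Hadoop setting.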

 On Jul 17, 2015, at 08:55, Sujit Pal sujitatgt...@gmail.com wrote:
 
 Hi Schmirr,
 
 The part after the s3n:// is your bucket name and folder name, ie 
 s3n://${bucket_name}/${folder_name}[/${subfolder_name}]*. Bucket names are 
 unique across S3, so the resulting path is also unique. There is no concept 
 of hostname in s3 urls as far as I know.
 
 -sujit
 
 
 On Fri, Jul 17, 2015 at 1:36 AM, Schmirr Wurst schmirrwu...@gmail.com 
 wrote:
 Hi,
 
 I wonder how to use S3 compatible Storage in Spark ?
 If I'm using s3n:// url schema, the it will point to amazon, is there
 a way I can specify the host somewhere ?
 
 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org
 


[jira] [Comment Edited] (SPARK-8009) [Mesos] Allow provisioning of executor logging configuration

2015-07-06 Thread Ankur Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616085#comment-14616085
 ] 

Ankur Chauhan edited comment on SPARK-8009 at 7/7/15 2:42 AM:
--

I think the correct way to handle this sort of problem is to allow the user to 
specify extra Uris that can be downloaded into the sandbox before the actual 
executor is started. Something via a property like 
{code}spark.executor.additionalUris = 
http://www.example.com/qa/file1.ext,http://www.example.com/qa/file2.ext{code} 

The specific problem of being able to specify the log4j.properties would be the 
combination of specifying this property along with 

{code}spark.executor.extraJavaOptions = 
-Dlog4j.configuration=${MESOS_SANDBOX}/log4j.properties{code}


[~tnachen] Does this sound reasonable? I have a simple change to support this 
use case but would like more input, this can be used to download any additional 
data etc. Patch: 
https://github.com/apache/spark/compare/master...ankurcha:fetch_extra_uris

Also, does this solution deserve a separate issue, or is right here fine 
(because it solves a broader problem)?
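
A sketch of how the two settings would combine on the application side; note that spark.executor.additionalUris is only the property proposed above (not an existing Spark configuration key) and the URL is illustrative:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  // proposed property (not in Spark yet): extra files fetched into the Mesos sandbox
  .set("spark.executor.additionalUris", "http://www.example.com/qa/log4j.properties")
  // existing property, as used in the comment above; ${MESOS_SANDBOX} is left
  // un-interpolated on purpose so it is resolved on the executor host, not by Scala
  .set("spark.executor.extraJavaOptions",
       "-Dlog4j.configuration=${MESOS_SANDBOX}/log4j.properties")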


was (Author: ankurcha):
I think the correct way to handle this sort of problem is to allow the user to 
specify extra Uris that can be downloaded into the sandbox before the actual 
executor is started. Something via a property like 
{code}spark.executor.additionalUris = 
http://www.example.com/qa/file1.ext,http://www.example.com/qa/file2.ext{code}. 
The specific problem of being able to specify the log4j.properties would be the 
combination of specifying this property along with 
{code}spark.executor.extraJavaOptions = 
-Dlog4j.configuration=${MESOS_SANDBOX}/log4j.properties{code}


[~tnachen] Does this sound reasonable? I have a simple change to support this 
use case but would like more input, this can be used to download any additional 
data etc. Patch: 
https://github.com/apache/spark/compare/master...ankurcha:fetch_extra_uris

Also, Does this solution deserve a separate issue or right here is fine ( 
because it solves a broader problem ).

 [Mesos] Allow provisioning of executor logging configuration 
 -

 Key: SPARK-8009
 URL: https://issues.apache.org/jira/browse/SPARK-8009
 Project: Spark
  Issue Type: Improvement
  Components: Mesos
Affects Versions: 1.3.1
 Environment: Mesos executor
Reporter: Gerard Maas
  Labels: logging, mesos

 It's currently not possible to provide a custom logging configuration for the 
 Mesos executors. 
 Upon startup of the executor JVM, it loads a default config file from the 
 Spark assembly, visible by this line in stderr: 
  Using Spark's default log4j profile: 
  org/apache/spark/log4j-defaults.properties
 That line comes from Logging.scala [1] where a default config is loaded if 
 none is found in the classpath upon the startup of the Spark Mesos executor 
 in the Mesos sandbox. At that point in time, none of the application-specific 
 resources have been shipped yet, as the executor JVM is just starting up.  
 To load a custom configuration file we should have it already on the sandbox 
 before the executor JVM starts and add it to the classpath on the startup 
 command.
 For the classpath customization, It looks like it should be possible to pass 
 a -Dlog4j.configuration  property by using the 
 'spark.executor.extraClassPath' that will be picked up at [2] and that should 
 be added to the command that starts the executor JVM, but the resource must 
 be already on the host before we can do that. Therefore we need some means of 
 'shipping' the log4j.configuration file to the allocated executor.
 This all boils down to the need of shipping extra files to the sandbox. 
 There's a workaround: open up the Spark assembly, replace the 
 log4j-default.properties and pack it up again.  That would work, although 
 kind of rudimentary as people may use the same assembly for many jobs.  
 Probably, accessing the log4j API programmatically should also work (we 
 didn't try that yet)
 [1] 
 https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/Logging.scala#L128
 [2] 
 https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L77



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-8009) [Mesos] Allow provisioning of executor logging configuration

2015-07-06 Thread Ankur Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616085#comment-14616085
 ] 

Ankur Chauhan commented on SPARK-8009:
--

I think the correct way to handle this sort of problem is to allow the user to 
specify extra Uris that can be downloaded into the sandbox before the actual 
executor is started. Something via a property like 
{code}spark.executor.additionalUris = 
http://www.example.com/qa/file1.ext,http://www.example.com/qa/file2.ext{code}. 
The specific problem of being able to specify the log4j.properties would be the 
combination of specifying this property along with 
{code}spark.executor.extraJavaOptions = 
-Dlog4j.configuration=${MESOS_SANDBOX}/log4j.properties{code}


[~tnachen] Does this sound reasonable? I have a simple change to support this 
use case but would like more input, this can be used to download any additional 
data etc. Patch: 
https://github.com/apache/spark/compare/master...ankurcha:fetch_extra_uris

Also, does this solution deserve a separate issue, or is right here fine 
(because it solves a broader problem)?

 [Mesos] Allow provisioning of executor logging configuration 
 -

 Key: SPARK-8009
 URL: https://issues.apache.org/jira/browse/SPARK-8009
 Project: Spark
  Issue Type: Improvement
  Components: Mesos
Affects Versions: 1.3.1
 Environment: Mesos executor
Reporter: Gerard Maas
  Labels: logging, mesos

 It's currently not possible to provide a custom logging configuration for the 
 Mesos executors. 
 Upon startup of the executor JVM, it loads a default config file from the 
 Spark assembly, visible by this line in stderr: 
  Using Spark's default log4j profile: 
  org/apache/spark/log4j-defaults.properties
 That line comes from Logging.scala [1] where a default config is loaded if 
 none is found in the classpath upon the startup of the Spark Mesos executor 
 in the Mesos sandbox. At that point in time, none of the application-specific 
 resources have been shipped yet, as the executor JVM is just starting up.  
 To load a custom configuration file we should have it already on the sandbox 
 before the executor JVM starts and add it to the classpath on the startup 
 command.
 For the classpath customization, It looks like it should be possible to pass 
 a -Dlog4j.configuration  property by using the 
 'spark.executor.extraClassPath' that will be picked up at [2] and that should 
 be added to the command that starts the executor JVM, but the resource must 
 be already on the host before we can do that. Therefore we need some means of 
 'shipping' the log4j.configuration file to the allocated executor.
 This all boils down to the need of shipping extra files to the sandbox. 
 There's a workaround: open up the Spark assembly, replace the 
 log4j-default.properties and pack it up again.  That would work, although 
 kind of rudimentary as people may use the same assembly for many jobs.  
 Probably, accessing the log4j API programmatically should also work (we 
 didn't try that yet)
 [1] 
 https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/Logging.scala#L128
 [2] 
 https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L77



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6707) Mesos Scheduler should allow the user to specify constraints based on slave attributes

2015-06-06 Thread Ankur Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576101#comment-14576101
 ] 

Ankur Chauhan commented on SPARK-6707:
--

Hi Kapil,

The patch is pretty much ready to go, but I haven't heard from any of the 
other reviewers on the pull request, so I can't say when it will be in master.



 Mesos Scheduler should allow the user to specify constraints based on slave 
 attributes
 --

 Key: SPARK-6707
 URL: https://issues.apache.org/jira/browse/SPARK-6707
 Project: Spark
  Issue Type: Improvement
  Components: Mesos, Scheduler
Affects Versions: 1.3.0
Reporter: Ankur Chauhan
  Labels: mesos, scheduler

 Currently, the mesos scheduler only looks at the 'cpu' and 'mem' resources 
 when trying to determine the usability of a resource offer from a mesos 
 slave node. It may be preferable for the user to be able to ensure that the 
 spark jobs are only started on a certain set of nodes (based on attributes). 
 For example, If the user sets a property, let's say 
 {code}spark.mesos.constraints{code} is set to 
 {code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
 checked to see if they meet both these constraints and only then will be 
 accepted to start new executors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: Cluster autoscaling in Spark+Mesos ?

2015-06-05 Thread Ankur Chauhan

Hi,

@Sharma - Is Mantis/Fenzo available on GitHub or somewhere similar? I did find
some Maven artifacts, but the repository netflix/fenzo is a 404. I am
interested in learning about the bin-packing logic of Fenzo.

-- Ankur Chauhan

On 04/06/2015 22:35, Sharma Podila wrote:
 We Autoscale our Mesos cluster in EC2 from within our framework.
 Scaling up can be easy via watching demand Vs supply. However,
 scaling down requires bin packing the tasks tightly onto as few
 servers as possible. Do you have any specific ideas on how you
 would leverage Mantis/Mesos for Spark based jobs? Fenzo, the
 scheduler part of Mantis, could be another point of leverage, which
 could give a framework the ability to autoscale the cluster among
 other benefits.
 
 
 
 On Thu, Jun 4, 2015 at 1:06 PM, Dmitry Goldenberg 
 dgoldenberg...@gmail.com mailto:dgoldenberg...@gmail.com
 wrote:
 
 Thanks, Vinod. I'm really interested in how we could leverage 
 something like Mantis and Mesos to achieve autoscaling in a 
 Spark-based data processing system...
 
 On Jun 4, 2015, at 3:54 PM, Vinod Kone vinodk...@gmail.com 
 mailto:vinodk...@gmail.com wrote:
 
 Hey Dmitry. At the current time there is no built-in support for 
 Mesos to autoscale nodes in the cluster. I've heard people 
 (Netflix?) do it out of band on EC2.
 
 On Thu, Jun 4, 2015 at 9:08 AM, Dmitry Goldenberg 
 dgoldenberg...@gmail.com mailto:dgoldenberg...@gmail.com
 wrote:
 
 A Mesos noob here. Could someone point me at the doc or summary
 for the cluster autoscaling capabilities in Mesos?
 
 Is there a way to feed it events and have it detect the need to
 bring in more machines or decommission machines?  Is there a way
 to receive events back that notify you that machines have been
 allocated or decommissioned?
 
 Would this work within a certain set of 
 preallocated/pre-provisioned/stand-by machines or will Mesos
 go and grab machines from the cloud?
 
 What are the integration points of Apache Spark and Mesos? What
 are the true advantages of running Spark on Mesos?
 
 Can Mesos autoscale the cluster based on some signals/events 
 coming out of Spark runtime or Spark consumers, then cause the 
 consumers to run on the updated cluster, or signal to the 
 consumers to restart themselves into an updated cluster?
 
 Thanks.
 
 
 


Re: Cluster autoscaling in Spark+Mesos ?

2015-06-04 Thread Ankur Chauhan

Yes. Breaking computations into smaller, simpler jobs that run often
is another way to go, but Spark will consume resource offers
as provided by Mesos without any problems/extra effort.

-- Ankur

On 04/06/2015 14:03, Tim Chen wrote:
 Spark is aware there are more resources by getting more resource
 offers and using those new offers.
 
 I don't think there is a way to refresh the Spark context for
 streaming.
 
 Tim
 
 On Thu, Jun 4, 2015 at 1:59 PM, Dmitry Goldenberg 
 dgoldenberg...@gmail.com mailto:dgoldenberg...@gmail.com
 wrote:
 
 Thanks, Ankur. I'd be curious to understand how the data exchange 
 happens in this case. How does Spark become aware of the fact that 
 machines have been added to the cluster or have been removed from 
 it?  And then, do you have some mechanism to perhaps restart the 
 Spark consumers into refreshed Spark context's which are aware of 
 the new cluster topology?
 
 On Thu, Jun 4, 2015 at 4:23 PM, Ankur Chauhan an...@malloc64.com 
 mailto:an...@malloc64.com wrote:
 
 AFAIK Mesos does not support host level auto-scaling because that
 is not the scope of the mesos-master or mesos-slave. In EC2 (like
 in my case) we have autoscaling groups set with cloudwatch metrics 
 hooked up to scaling policies. In our case, we have the following. 
 * Add 1 host per AZ when cpu load is > 85% for 15 mins
 continuously. * Remove 1 host if the cpu load is < 15% for 15 mins
 continuously. * Similar monitoring + scale-up/scale-down based on
 memory.
 
 All of these rules have a cooldown period of 30mins so that we
 don't end-up scaling up/down too fast.
 
 Then again, our workload is bursty (spark on mesos in fine-grained 
 mode). So, the new resources get used up and tasks distribute
 pretty fast. The above may not work in case you have long-running
 tasks (such as marathon tasks) because they would not be
 redistributed till some task restarting happens.
 
 -- Ankur
 
 On 04/06/2015 13:13, Dmitry Goldenberg wrote:
 Would it be accurate to say that Mesos helps you optimize
 resource utilization out of a preset  pool of resources,
 presumably
 servers?
 And its level of autoscaling is within that pool?
 
 
 On Jun 4, 2015, at 3:54 PM, Vinod Kone vinodk...@gmail.com
 mailto:vinodk...@gmail.com
 mailto:vinodk...@gmail.com mailto:vinodk...@gmail.com
 wrote:
 
 Hey Dmitry. At the current time there is no built-in support
 for Mesos to autoscale nodes in the cluster. I've heard people 
 (Netflix?) do it out of band on EC2.
 
 On Thu, Jun 4, 2015 at 9:08 AM, Dmitry Goldenberg 
 dgoldenberg...@gmail.com mailto:dgoldenberg...@gmail.com
 mailto:dgoldenberg...@gmail.com
 mailto:dgoldenberg...@gmail.com
 wrote:
 
 A Mesos noob here. Could someone point me at the doc or
 summary for the cluster autoscaling capabilities in Mesos?
 
 Is there a way to feed it events and have it detect the need
 to bring in more machines or decommission machines?  Is there a
 way to receive events back that notify you that machines have
 been allocated or decommissioned?
 
 Would this work within a certain set of 
 preallocated/pre-provisioned/stand-by machines or will
 Mesos go and grab machines from the cloud?
 
 What are the integration points of Apache Spark and Mesos?
 What are the true advantages of running Spark on Mesos?
 
 Can Mesos autoscale the cluster based on some signals/events 
 coming out of Spark runtime or Spark consumers, then cause the 
 consumers to run on the updated cluster, or signal to the 
 consumers to restart themselves into an updated cluster?
 
 Thanks.
 
 
 
 
 


Re: [DISCUSS] Renaming Mesos Slave

2015-06-04 Thread Ankur Chauhan

+1 master/slave

James made some very good points and there is no technical reason for
wasting time on this.

On 04/06/2015 08:45, James Vanns wrote:
 +1 master/slave, no change needed.
 
 I couldn't agree more. This is a barmy request; master/slave is a
 well understood common convention (if it isn't well defined). This
 is making an issue out of something that isn't. Not at least as far
 as I see it - I don't have a habit of confusing software/systems
 nomenclature with moral high ground. This would just be a waste of
 time and not just for developers but for those adopting/who have
 adopted Mesos. If it were a brand new project at the early stages
 of just throwing ideas around, then fine - call master/slave
 whatever you want. Gru/Minion would get my vote if that were the
 case ;)
 
 Cheers,
 
 Jim
 
 
 On 4 June 2015 at 16:23, Eren Güven erenguv...@gmail.com 
 mailto:erenguv...@gmail.com wrote:
 
 +1 master/slave, no change needed
 
 Such a change is a waste of time with no technical benefit. Also 
 agree with Itamar, a breaking change like this will cause upgrade
 pains.
 
 Cheers
 
 On 4 June 2015 at 17:08, tommy xiao xia...@gmail.com 
 mailto:xia...@gmail.com wrote:
 
 +1 to James DeFelice.  I don't feel the name is confusing in any
 circumstance.
 
 2015-06-04 22:06 GMT+08:00 James DeFelice james.defel...@gmail.com
 mailto:james.defel...@gmail.com:
 
 -1 master/worker -1 master/agent -1 leader/follower
 
 +1 master/slave; no change needed
 
 There's no technical benefit **at all** to a terminology change at
 this point. If people want to change the names in their client
 presentations that's fine. Master/slave conveys specific meaning
 that is lost otherwise. In this context of this project (and
 elsewhere in Engineering-related fields) the terms are technical
 jargon and have no social implications within such context.
 
 
 On Thu, Jun 4, 2015 at 9:53 AM, Till Toenshoff toensh...@me.com
 mailto:toensh...@me.com wrote:
 
 1. Mesos Worker [node/host/machine] 2. Mesos Worker [process] 3.
 No, master/worker seems to address the issue with less changes. 
 4. Begin using the new name ASAP, add a disambiguation to the
 docs, and change old references over time. Fixing the official
 name, even before changes are in place, would be a good first
 step.
 
 +1
 
 
 
 
 -- James DeFelice 585.241.9488 tel:585.241.9488 (voice) 
 650.649.6071 tel:650.649.6071 (fax)
 
 
 
 
 -- Deshi Xiao Twitter: xds2000 E-mail: xiaods(AT)gmail.com
 http://gmail.com
 
 
 
 
 
 -- -- Senior Code Pig, Industrial Light & Magic


RE: Using Spark like a search engine

2015-05-25 Thread ankur chauhan
Hi,

I am sure you can use Spark for this, but it seems like a problem that should be 
delegated to a text-based indexing technology like Elasticsearch or something 
based on Lucene to serve the requests. Spark can be used to prepare the data 
that is fed to the indexing service.

Using Spark directly means there would be a lot of repeated computation 
between requests, which can be avoided.

There are a bunch of Spark-Elasticsearch bindings that can be used to make the 
process easier.

Again, Spark SQL can help you convert most of the logic directly to Spark jobs, 
but I would suggest exploring text indexing technologies too.

-- ankur
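
A sketch of the indexing side with the elasticsearch-hadoop connector (the org.elasticsearch.spark._ import and saveToEs come from that library; the index name, node address and CV fields are made up for illustration):

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._   // adds saveToEs(...) to RDDs

case class Cv(id: String, skills: Seq[String], yearsExperience: Int, location: String)

val conf = new SparkConf()
  .setAppName("cv-indexer")
  .set("es.nodes", "es-node-1:9200")             // hypothetical ES cluster

val sc = new SparkContext(conf)

// In practice the CVs would come from the system of record; hard-coded here for illustration.
val cvs = sc.parallelize(Seq(
  Cv("cv-1", Seq("scala", "spark"), 5, "vladivostok"),
  Cv("cv-2", Seq("sql", "postgres"), 8, "moscow")
))

// Each case class becomes one document; per-vacancy scoring/top-N is then served by the index.
cvs.saveToEs("cvs/cv", Map("es.mapping.id" -> "id"))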

-Original Message-
From: Сергей Мелехин cpro...@gmail.com
Sent: ‎5/‎24/‎2015 10:59 PM
To: user@spark.apache.org user@spark.apache.org
Subject: Using Spark like a search engine

HI!
We are developing a scoring system for recruitment. A recruiter enters vacancy 
requirements, and we score tens of thousands of CVs against these requirements and 
return e.g. the top 10 matches.
We do not use fulltext search and sometimes don't even filter input CVs prior to 
scoring (some vacancies do not have mandatory requirements that can be used as 
a filter effectively).


So we have scoring function F(CV,VACANCY) that is currently inplemented in SQL 
and runs on Postgresql cluster. In worst case F is executed once on every CV in 
database. VACANCY part is fixed for one query, but changes between queries and 
there's very little we can process in advance.


We expect to have about 100 000 000 CVs in the next year, and do not expect our 
current implementation to offer the desired low-latency response (< 1 s) on 100M 
CVs. So we are looking for a horizontally scalable and fault-tolerant in-memory 
solution.


Will Spark be usefull for our task? All tutorials I could find describe stream 
processing, or ML applications. What Spark extensions/backends can be useful?




With best regards, Segey Melekhin

[jira] [Commented] (SPARK-6707) Mesos Scheduler should allow the user to specify constraints based on slave attributes

2015-05-17 Thread Ankur Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547379#comment-14547379
 ] 

Ankur Chauhan commented on SPARK-6707:
--

I had not thought of that. Plus, this one is about attribute-based constraints 
(which are simple text-based key-value pairs; see 
http://mesos.apache.org/documentation/attributes-resources/ ). Please open an 
additional ticket; I may have some ideas about custom resource requirements and 
I'll document them there.

 Mesos Scheduler should allow the user to specify constraints based on slave 
 attributes
 --

 Key: SPARK-6707
 URL: https://issues.apache.org/jira/browse/SPARK-6707
 Project: Spark
  Issue Type: Improvement
  Components: Mesos, Scheduler
Affects Versions: 1.3.0
Reporter: Ankur Chauhan
  Labels: mesos, scheduler

 Currently, the mesos scheduler only looks at the 'cpu' and 'mem' resources 
 when trying to determine the usability of a resource offer from a mesos 
 slave node. It may be preferable for the user to be able to ensure that the 
 spark jobs are only started on a certain set of nodes (based on attributes). 
 For example, If the user sets a property, let's say 
 {code}spark.mesos.constraints{code} is set to 
 {code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
 checked to see if they meet both these constraints and only then will be 
 accepted to start new executors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: Spark on Mesos vs Yarn

2015-05-15 Thread Ankur Chauhan

Hi Tim,

Thanks for such a detailed email. I am excited to hear about the new
features. I had a pull request open for adding attribute-based
filtering in the Mesos scheduler, but it hasn't received much love:
https://github.com/apache/spark/pull/5563 . I am a fan of the
Mesos/Marathon/Mesosphere and Spark ecosystems and am trying to push
adoption at my workplace.

I would love to see documentation, tutorials (anything, really) that
would make Mesos + Spark a better and more fleshed-out solution. Would
it be possible for you to share some links to the JIRAs and pull
requests so that I can keep track of the progress/features?

Again, thanks for replying.

-- Ankur Chauhan

On 15/05/2015 00:39, Tim Chen wrote:
 Hi Ankur,
 
 This is a great question as I've heard similar concerns about Spark
 on Mesos.
 
 At the time when I started to contribute to Spark on Mesos approx
 half year ago, the Mesos scheduler and related code hasn't really
 got much attention from anyone and it was pretty much in
 maintenance mode.
 
 As a Mesos PMC that is really interested in Spark I started to
 refactor and check out different JIRAs and PRs around the Mesos
 scheduler, and after that started to fix various bugs in Spark,
 added documentation and also in fix related Mesos issues as well.
 
 Just recently for 1.4 we've merged in Cluster mode and Docker
 support, and there are also pending PRs around framework
 authentication, multi-role support, dynamic allocation, more finer
 tuned coarse grain mode scheduling configurations, etc.
 
 And finally just want to mention that Mesosphere and Typesafe is 
 collaborating to bring a certified distribution 
 (https://databricks.com/spark/certification/certified-spark-distribution)
 of Spark on Mesos and DCOS, and we will be pouring resources into
 not just maintain Spark on Mesos but drive more features into the
 Mesos scheduler and also in Mesos so stateful services can leverage
 new APIs and features to make better scheduling decisions and
 optimizations.
 
 I don't have a solidified roadmap to share yet, but we will be 
 discussing this and hopefully can share with the community soon.
 
 In summary Spark on Mesos is not dead or in maintenance mode, and
 look forward to see a lot more changes from us and the community.
 
 Tim
 
 On Thu, May 14, 2015 at 11:30 PM, Ankur Chauhan
 an...@malloc64.com mailto:an...@malloc64.com wrote:
 
 Hi,
 
 This is both a survey type as well as a roadmap query question. It 
 seems like of the cluster options to run spark (i.e. via YARN and 
 Mesos), YARN seems to be getting a lot more attention and patches
 when compared to Mesos.
 
 Would it be correct to assume that spark on mesos is more or less
 a dead or something like a maintenance-only feature and YARN is
 the recommended way to go?
 
 What is the roadmap for spark on mesos? and what is the roadmap
 for spark on yarn. I like mesos so as much as I would like to see
 it thrive I don't think spark community is active (or maybe it
 just appears that way).
 
 Another more community oriented question: what do most people use
 to run spark in production or more-than-POC products? Why did you
 make that decision?
 
 There was a similar post form early 2014 where Metei answered that 
 mesos and yarn were equally important, but has this changed as
 spark has now reached almost 1.4.0 stage?
 
 -- Ankur Chauhan
 
 -

 
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 mailto:user-unsubscr...@spark.apache.org For additional commands,
 e-mail: user-h...@spark.apache.org 
 mailto:user-h...@spark.apache.org
 
 

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark on Mesos vs Yarn

2015-05-15 Thread Ankur Chauhan

Hi,

This is both a survey type as well as a roadmap query question. It
seems like of the cluster options to run spark (i.e. via YARN and
Mesos), YARN seems to be getting a lot more attention and patches when
compared to Mesos.

Would it be correct to assume that Spark on Mesos is more or less
dead, or something like a maintenance-only feature, and YARN is the
recommended way to go?

What is the roadmap for Spark on Mesos, and what is the roadmap for
Spark on YARN? I like Mesos, so as much as I would like to see it
thrive, I don't think the Spark community is active there (or maybe it
just appears that way).

Another, more community-oriented question: what do most people use to
run Spark in production or in more-than-POC products? Why did you make
that decision?

There was a similar post from early 2014 where Matei answered that
Mesos and YARN were equally important, but has this changed now that
Spark has almost reached the 1.4.0 stage?

-- Ankur Chauhan

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: kafka + Spark Streaming with checkPointing fails to restart

2015-05-14 Thread Ankur Chauhan

Thanks everyone, that was the problem. The "create new streaming
context" function was supposed to set up the stream processing as well
as the checkpoint directory. I had missed the checkpoint setup
entirely. With that done, everything works as expected.

For the benefit of others, my final, working version of the code
looks like this:


object RawLogProcessor extends Logging {

  import TacomaHelper._

  val checkpointDir = "/tmp/checkpointDir_tacoma"
  var ssc: Option[StreamingContext] = None

  def createSparkConf(config: Config): SparkConf = {
    val sparkConf = new SparkConf()
    config.entrySet.asScala
      .map(kv => kv.getKey -> kv.getValue)
      .foreach { case (k, v) => sparkConf.set(s"spark.$k", unquote(v.render())) }

    sparkConf.registerKryoClasses(Array(classOf[VideoView], classOf[RawLog],
      classOf[VideoEngagement], classOf[VideoImpression]))
    sparkConf
  }

  // a function that returns a function of type: `() => StreamingContext`
  def createContext(sparkConfig: Config, kafkaConf: Config)(f: StreamingContext => StreamingContext) = () => {
    val batchDurationSecs = sparkConfig.getDuration("streaming.batch_duration", TimeUnit.SECONDS)
    val sparkConf = createSparkConf(sparkConfig)

    // calculate sparkContext and streamingContext
    val streamingContext = new StreamingContext(sparkConf, Durations.seconds(batchDurationSecs))
    streamingContext.checkpoint(checkpointDir)

    // apply the pipeline function to the streaming context
    f(streamingContext)
  }

  def createNewContext(sparkConf: Config, kafkaConf: Config, f: StreamingContext => StreamingContext) = {
    logInfo("Create new Spark streamingContext with provided pipeline function")
    StreamingContext.getOrCreate(
      checkpointPath = checkpointDir,
      creatingFunc = createContext(sparkConf, kafkaConf)(f),
      createOnError = true)
  }

  def apply(sparkConfig: Config, kafkaConf: Config): StreamingContext = {
    rawlogTopic = kafkaConf.getString("rawlog.topic")
    kafkaParams = kafkaConf.entrySet.asScala
      .map(kv => kv.getKey -> unquote(kv.getValue.render()))
      .toMap

    if (ssc.isEmpty) {
      ssc = Some(createNewContext(sparkConfig, kafkaConf, setupPipeline))
    }
    ssc.get
  }

  var rawlogTopic: String = "qa-rawlog"
  var kafkaParams: Map[String, String] = Map()

  def setupPipeline(streamingContext: StreamingContext): StreamingContext = {

    logInfo("Creating new kafka rawlog stream")
    // TODO: extract this and pass it around somehow
    val rawlogDStream = KafkaUtils.createDirectStream[String, Object, StringDecoder, KafkaAvroDecoder](
      streamingContext, kafkaParams, Set(rawlogTopic))

    logInfo("adding step to parse kafka stream into RawLog types (Normalizer)")
    val eventStream = rawlogDStream
      .map({
        case (key, rawlogVal) =>
          val record = rawlogVal.asInstanceOf[GenericData.Record]
          val rlog = RawLog.newBuilder()
            .setId(record.get("id").asInstanceOf[String])
            .setAccount(record.get("account").asInstanceOf[String])
            .setEvent(record.get("event").asInstanceOf[String])
            .setTimestamp(record.get("timestamp").asInstanceOf[Long])
            .setUserAgent(record.get("user_agent").asInstanceOf[String])
            .setParams(record.get("params").asInstanceOf[java.util.Map[String, String]])
            .build()
          val norm = Normalizer(rlog)
          (key, rlog.getEvent, norm)
      })

    logInfo("Adding step to filter out VideoView only events and cache them")
    val videoViewStream = eventStream
      .filter(_._2 == "video_view")
      .filter(_._3.isDefined)
      .map((z) => (z._1, z._3.get))
      .map((z) => (z._1, z._2.asInstanceOf[VideoView]))
      .cache()

    // repartition by account
    logInfo("repartition videoView by account and calculate stats")
    videoViewStream.map((v) => (v._2.getAccount, 1))
      .filter(_._1 != null)
      .window(Durations.seconds(20))
      .reduceByKey(_ + _)
      .print()

    // repartition by (deviceType, DeviceOS)
    logInfo("repartition videoView by (DeviceType, DeviceOS) and calculate stats")
    videoViewStream.map((v) => ((v._2.getDeviceType, v._2.getDeviceOs), 1))
      .reduceByKeyAndWindow(_ + _, Durations.seconds(10))
      .print()

    streamingContext
  }

}

- Ankur

On 13/05/2015 23:52, NB wrote:
 The data pipeline (DAG) should not be added to the StreamingContext
 in the case of a recovery scenario. The pipeline metadata is
 recovered from the checkpoint folder. That is one thing you will
 need to fix in your code. Also, I don't think the
 ssc.checkpoint(folder) call should be made in case of the
 recovery.
 
 The idiom to follow is to set up the DAG in the creatingFunc and
 not outside of it. This will ensure that if a new context is being
 created i.e. checkpoint folder does not exist, the DAG will get
 added to it and then checkpointed. Once a recovery happens, this
 function is not invoked but everything is recreated from the
 

kafka + Spark Streaming with checkPointing fails to restart

2015-05-13 Thread Ankur Chauhan
()
        val norm = Normalizer(rlog)
        (key, rlog.getEvent, norm)
    })

  val videoViewStream = eventStream
    .filter(_._2 == "video_view")
    .filter(_._3.isDefined)
    .map((z) => (z._1, z._3.get))
    .map((z) => (z._1, z._2.asInstanceOf[VideoView]))
    .cache()

  // repartition by (deviceType, DeviceOS)
  val deviceTypeVideoViews = videoViewStream.map((v) => ((v._2.getDeviceType, v._2.getDeviceOs), 1))
    .reduceByKeyAndWindow(_ + _, Durations.seconds(10))
    .print()
}

object RawLogProcessor extends Logging {

  /**
   * If str is surrounded by quotes it returns the content between the quotes
   */
  def unquote(str: String) = {
    if (str != null && str.length >= 2 && str.charAt(0) == '\"' && str.charAt(str.length - 1) == '\"')
      str.substring(1, str.length - 1)
    else
      str
  }

  val checkpointDir = "/tmp/checkpointDir_tacoma"
  var sparkConfig: Config = _
  var ssc: StreamingContext = _
  var processor: Option[RawLogProcessor] = None

  val createContext: () => StreamingContext = () => {
    val batchDurationSecs = sparkConfig.getDuration("streaming.batch_duration", TimeUnit.SECONDS)
    val sparkConf = new SparkConf()
    sparkConf.registerKryoClasses(Array(classOf[VideoView], classOf[RawLog],
      classOf[VideoEngagement], classOf[VideoImpression]))
    sparkConfig.entrySet.asScala
      .map(kv => kv.getKey -> kv.getValue)
      .foreach {
        case (k, v) =>
          val value = unquote(v.render())

          logInfo(s"spark.$k = $value")

          sparkConf.set(s"spark.$k", value)
      }

    // calculate sparkContext and streamingContext
    new StreamingContext(sparkConf, Durations.seconds(batchDurationSecs))
  }

  def createProcessor(sparkConf: Config, kafkaConf: Config): RawLogProcessor = {
    sparkConfig = sparkConf
    ssc = StreamingContext.getOrCreate(checkpointPath = checkpointDir,
      creatingFunc = createContext, createOnError = true)
    ssc.checkpoint(checkpointDir)
    // kafkaProperties
    val kafkaParams = kafkaConf.entrySet.asScala
      .map(kv => kv.getKey -> unquote(kv.getValue.render()))
      .toMap

    logInfo(s"Initializing kafkaParams = $kafkaParams")
    // create processor
    new RawLogProcessor(ssc, kafkaConf.getString("rawlog.topic"), kafkaParams)
  }

  def apply(sparkConfig: Config, kafkaConf: Config) = {
    if (processor.isEmpty) {
      processor = Some(createProcessor(sparkConfig, kafkaConf))
    }
    processor.get
  }

  def start() = {
    ssc.start()
    ssc.awaitTermination()
  }

}

Extended logs:
https://gist.githubusercontent.com/ankurcha/f35df63f0d8a99da0be4/raw/ec96b932540ac87577e4ce8385d26699c1a7d05e/spark-console.log

Could someone tell me what causes this problem? I tried looking at
the stacktrace, but I am not familiar enough with the codebase to make
solid assertions.
Any ideas as to what may be happening here?

--- Ankur Chauhan

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: how to monitor multi directories in spark streaming task

2015-05-13 Thread Ankur Chauhan

Hi,

You could retroactively union an existing DStream with one from a
newly created file. Then, when another file is detected, you would
need to re-union the streams and create another DStream. It seems like
the implementation of FileInputDStream only looks for files in the
directory, and the filtering is applied using the
FileSystem.listStatus(dir, filter) method, which does not provide
recursive listing.

A cleaner solution would be to extend FileInputDStream and override
findNewFiles(...) with the ability to recursively list files
(probably by using FileSystem.listFiles).

Refer: http://stackoverflow.com/a/25645225/113411
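
A minimal sketch of the union approach for a fixed, known set of directories (paths and batch interval are illustrative); when directories appear over time, the recursive FileInputDStream variant from the link above is the better fit:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(new SparkConf().setAppName("multi-dir"), Seconds(30))

// One text stream per directory, then a single unioned DStream to process.
val dirs = Seq("/user/root/2015/05/11", "/user/root/2015/05/12", "/user/root/2015/05/13")
val perDirStreams = dirs.map(dir => ssc.textFileStream(dir))
val allLines = ssc.union(perDirStreams)

allLines.count().print()

ssc.start()
ssc.awaitTermination()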

-- Ankur


On 13/05/2015 02:03, lisendong wrote:
 but in fact the directories are not ready at the beginning to my
 task .
 
 for example:
 
 /user/root/2015/05/11/data.txt /user/root/2015/05/12/data.txt 
 /user/root/2015/05/13/data.txt
 
 like this.
 
 and one new directory one day.
 
 how to create the new DStream for tomorrow’s new
 directory(/user/root/2015/05/13/) ??
 
 
 On 13 May 2015, at 16:59, Ankur Chauhan achau...@brightcove.com wrote:
 
 I would suggest creating one DStream per directory and then using 
 StreamingContext#union(...) to get a union DStream.
 
 -- Ankur
 
 On 13/05/2015 00:53, hotdog wrote:
 I want to use use fileStream in spark streaming to monitor
 multi hdfs directories, such as:
 
 val list_join_action_stream = ssc.fileStream[LongWritable,
 Text, TextInputFormat](/user/root/*/*, check_valid_file(_),
  false).map(_._2.toString).print
 
 
 By the way, I could not understand the meaning of the three classes
 : LongWritable, Text, TextInputFormat
 
 but it doesn't work...
 
 
 
 -- View this message in context: 
 http://apache-spark-user-list.1001560.n3.nabble.com/how-to-monitor-multi-directories-in-spark-streaming-task-tp22863.html
 
 
 Sent from the Apache Spark User List mailing list archive at
 Nabble.com.
 
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org
 
 
 

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: how to monitor multi directories in spark streaming task

2015-05-13 Thread Ankur Chauhan

I would suggest creating one DStream per directory and then using
StreamingContext#union(...) to get a union DStream.

-- Ankur

On 13/05/2015 00:53, hotdog wrote:
 I want to use use fileStream in spark streaming to monitor multi
 hdfs directories, such as:
 
 val list_join_action_stream = ssc.fileStream[LongWritable, Text, 
 TextInputFormat](/user/root/*/*, check_valid_file(_), 
 false).map(_._2.toString).print
 
 
 By the way, I could not understand the meaning of the three classes: 
 LongWritable, Text, TextInputFormat
 
 but it doesn't work...
 
 
 
 -- View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/how-to-monitor-multi-directories-in-spark-streaming-task-tp22863.html

 
Sent from the Apache Spark User List mailing list archive at Nabble.com.
 
 -

 
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org
 


0x6D461C4A.asc
Description: application/pgp-keys


0x6D461C4A.asc.sig
Description: Binary data

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

kafka + Spark Streaming with checkPointing fails to restart

2015-05-13 Thread Ankur Chauhan
()
   val norm = Normalizer(rlog)
   (key, rlog.getEvent, norm)
   })

 val videoViewStream = eventStream
   .filter(_._2 == video_view)
   .filter(_._3.isDefined)
   .map((z) => (z._1, z._3.get))
   .map((z) => (z._1, z._2.asInstanceOf[VideoView]))
   .cache()

 // repartition by (deviceType, DeviceOS)
 val deviceTypeVideoViews = videoViewStream.map((v) =>
((v._2.getDeviceType, v._2.getDeviceOs), 1))
   .reduceByKeyAndWindow(_ + _, Durations.seconds(10))
   .print()
}

object RawLogProcessor extends Logging {

 /**
   * If str is surrounded by quotes, it returns the content between the quotes.
  */
 def unquote(str: String) = {
   if (str != null && str.length >= 2 && str.charAt(0) == '"' &&
str.charAt(str.length - 1) == '"')
 str.substring(1, str.length - 1)
   else
 str
 }

 val checkpointDir = "/tmp/checkpointDir_tacoma"
 var sparkConfig: Config = _
 var ssc: StreamingContext = _
 var processor: Option[RawLogProcessor] = None

 val createContext: () => StreamingContext = () => {
   val batchDurationSecs =
sparkConfig.getDuration("streaming.batch_duration", TimeUnit.SECONDS)
   val sparkConf = new SparkConf()
   sparkConf.registerKryoClasses(Array(classOf[VideoView],
classOf[RawLog], classOf[VideoEngagement], classOf[VideoImpression]))
   sparkConfig.entrySet.asScala
 .map(kv => kv.getKey -> kv.getValue)
 .foreach {
   case (k, v) =>
 val value = unquote(v.render())

 logInfo(s"spark.$k = $value")

 sparkConf.set(s"spark.$k", value)
 }

   // calculate sparkContext and streamingContext
   new StreamingContext(sparkConf, Durations.seconds(batchDurationSecs))
 }

 def createProcessor(sparkConf: Config, kafkaConf: Config):
RawLogProcessor = {
   sparkConfig = sparkConf
   ssc = StreamingContext.getOrCreate(checkpointPath = checkpointDir,
creatingFunc = createContext, createOnError = true)
   ssc.checkpoint(checkpointDir)
   // kafkaProperties
   val kafkaParams = kafkaConf.entrySet.asScala
 .map(kv => kv.getKey -> unquote(kv.getValue.render()))
 .toMap

   logInfo(s"Initializing kafkaParams = $kafkaParams")
   // create processor
   new RawLogProcessor(ssc, kafkaConf.getString("rawlog.topic"),
kafkaParams)
 }

 def apply(sparkConfig: Config, kafkaConf: Config) = {
   if (processor.isEmpty) {
 processor = Some(createProcessor(sparkConfig, kafkaConf))
   }
   processor.get
 }

 def start() = {
   ssc.start()
   ssc.awaitTermination()
 }

}

Extended logs:
https://gist.githubusercontent.com/ankurcha/f35df63f0d8a99da0be4/raw/ec96b932540ac87577e4ce8385d26699c1a7d05e/spark-console.log

Could someone tell me what causes this problem? I tried looking at
the stacktrace, but I am not familiar enough with the codebase to make
solid assertions.
Any ideas as to what may be happening here?

- --- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVUxNrAAoJEOSJAMhvLp3LQOYH/A/s82dg4ogYspV3OBkvY3uT
qcTrFDrOou7qePoRhQecb7TUyA9pN6e/SWfNJ+25ZXM20tgLYqG0GFx3ZRmJoyMt
/DeeQHyC4ZnfsxVT2DQksrRa0yeXGQNKXwPuq6dQ8aCfgj5FjpYFaZ1uhsim+IR7
7HSSXBwoyMVnxRWzMirLxd3UAYVHd64/csz6b3Mwn0CqrRIaDu7vgoO7H7PMmve/
fzYx3o8Whvkiw71ZTAt80sd9KRAF0/hZtXv7KrKUjpfjRD0QAJj6fBJbgQZHmwJ1
a7PCUcRsX+v0SNcT9AEvubk+UR/ZzD9hoH8HcA6K4VjAdpptc2GZAnOiecyinQU=
=0Lkk
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



data schema and serialization format suggestions

2015-05-13 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi all,

I want to get a general idea about best practices around data format
serialization. Currently, I am using avro as the data serialization
format but the emitted types aren't very scala friendly. So, I was
wondering how others deal with this problem.

At the high level, the requirements are fairly simple:

1. Simple and easy to understand and extend.
2. Usable in places other than spark (I would want to use them in
other applications and tools).
3. Ability to play nice with parquet and Kafka (nice to have).
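
To give a concrete sense of the "not very scala friendly" part, the kind of
glue code this typically implies looks roughly like the sketch below (the
record and field names are made up purely for illustration):

import org.apache.avro.generic.GenericRecord

// hypothetical domain type -- the field names are illustrative only
case class VideoEvent(account: String, videoId: String, timestamp: Long)

// adapt an Avro GenericRecord into an idiomatic Scala case class
def toVideoEvent(record: GenericRecord): VideoEvent = VideoEvent(
  account   = record.get("account").toString,
  videoId   = record.get("video_id").toString,
  timestamp = record.get("timestamp").asInstanceOf[Long])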

- -- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVU4spAAoJEOSJAMhvLp3LYYQH/A3AvjBhclydO4pqxoLzUSiK
e844eteYEHqLSiwrFKxG8I8uLDLRYVvln0XnmKnEqoaGwNbqC5IbqovNKE8FwFzk
F7XU30O7CEgExBrSXsv7nSFq/BBSELnCGpuszf92lF2XRaFtOz0kJZ7YOf+IIZEn
mV4K4IodJhkCzX3oeAO/3PwzMlUXTV+qDDA9pa6pwrQ2TKRMYs8HTzAf526cL5/F
0RdFHse1JjOWDiiaCCI1aHNbtRM/TJvsyGcLlqNDjeOVFcTFcKU3QQLxydfqNRK6
JYn9jY/NRqseePaW4L9BqEBSQ7aTNRk1P88r5+8FFes2qImkgQ5+VIw+DlwEePM=
=LwBs
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark streaming updating a large window more frequently

2015-05-08 Thread Ankur Chauhan
Hi,

I am pretty new to spark/spark_streaming, so please excuse my naivety. I have a 
timestamped streaming event stream that I would like to aggregate into, let's say, 
hourly buckets. The simple answer is to use a window operation with a window length 
of 1 hour and a sliding interval of 1 hour, but that doesn't exactly work:

1. The time boundaries aren't perfect, i.e. the process/stream aggregation may get 
started in the middle of the hour, so the first hour may actually be less than 1 hour 
long, and subsequent hours should be aligned to the next hour boundary.
2. If I understand this correctly, the above method would mean that all my data is 
collected for 1 hour and then summarised. Though correct, how do I get the 
aggregations to occur more frequently than that? Something like: aggregate these 
events into hourly buckets, updating them every 5 seconds.

I would really appreciate pointers to code samples or some blogs that could 
help me identify best practices.
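
To make the intent concrete, the shape of what I am after is roughly the sketch
below (untested; the events DStream, its Event type and the epoch-millis
timestamps are assumptions for illustration):

import org.apache.spark.streaming.dstream.DStream

// running per-hour counts that refresh every batch interval;
// requires ssc.checkpoint(...) to be set since it uses updateStateByKey
def hourlyRunningCounts[Event](events: DStream[(Long, Event)]): DStream[(Long, Long)] = {
  val hourMillis = 60L * 60L * 1000L
  events
    .map { case (ts, _) => (ts / hourMillis * hourMillis, 1L) }   // align key to the event's hour
    .updateStateByKey[Long] { (batch: Seq[Long], running: Option[Long]) =>
      Some(running.getOrElse(0L) + batch.sum)                     // running total per hour bucket
    }
}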

-- Ankur Chauhan


signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: history server

2015-05-07 Thread Ankur Chauhan
Hi,

Sorry this may be a little off topic but I tried searching for docs on history 
server but couldn't really find much. Can someone point me to a doc or give me 
a point of reference for the use and intent of a history server?


-- Ankur

 On 7 May 2015, at 12:06, Koert Kuipers ko...@tresata.com wrote:
 
 got it. thanks!
 
 On Thu, May 7, 2015 at 2:52 PM, Marcelo Vanzin van...@cloudera.com wrote:
 Ah, sorry, that's definitely what Shixiong mentioned. The patch I mentioned 
 did not make it into 1.3...
 
 On Thu, May 7, 2015 at 11:48 AM, Koert Kuipers ko...@tresata.com wrote:
  Seems I have one thread spinning at 100% for a while now, in 
  FsHistoryProvider.initialize(). Maybe there is something wrong with my logs on hdfs 
  that it's reading? Or could it really take 30 mins to read all the 
  history on hdfs?
 
 jstack:
 
 Deadlock Detection:
 
 No deadlocks found.
 
 Thread 2272: (state = BLOCKED)
  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information 
 may be imprecise)
  - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) 
 @bci=20, line=226 (Compiled frame)
  - 
 java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.util.concurrent.SynchronousQueue$TransferStack$SNode,
  boolean, long) @bci=174, line=460 (Compiled frame)
  - 
 java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.lang.Object,
  boolean, long) @bci=102, line=359 (Interpreted frame)
  - java.util.concurrent.SynchronousQueue.poll(long, 
 java.util.concurrent.TimeUnit) @bci=11, line=942 (Interpreted frame)
  - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=141, line=1068 
 (Interpreted frame)
  - 
 java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
  @bci=26, line=1130 (Interpreted frame)
  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 
 (Interpreted frame)
  - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
 
 
 Thread 1986: (state = BLOCKED)
  - java.lang.Thread.sleep(long) @bci=0 (Interpreted frame)
  - org.apache.hadoop.hdfs.PeerCache.run() @bci=41, line=250 (Interpreted 
 frame)
  - 
 org.apache.hadoop.hdfs.PeerCache.access$000(org.apache.hadoop.hdfs.PeerCache) 
 @bci=1, line=41 (Interpreted frame)
  - org.apache.hadoop.hdfs.PeerCache$1.run() @bci=4, line=119 (Interpreted 
 frame)
  - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
 
 
 Thread 1970: (state = BLOCKED)
 
 
 Thread 1969: (state = BLOCKED)
  - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
  - java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=135 (Interpreted 
 frame)
  - java.lang.ref.ReferenceQueue.remove() @bci=2, line=151 (Interpreted frame)
  - java.lang.ref.Finalizer$FinalizerThread.run() @bci=36, line=209 
 (Interpreted frame)
 
 
 Thread 1968: (state = BLOCKED)
  - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
  - java.lang.Object.wait() @bci=2, line=503 (Interpreted frame)
  - java.lang.ref.Reference$ReferenceHandler.run() @bci=46, line=133 
 (Interpreted frame)
 
 
 Thread 1958: (state = IN_VM)
  - java.lang.Throwable.fillInStackTrace(int) @bci=0 (Compiled frame; 
 information may be imprecise)
  - java.lang.Throwable.fillInStackTrace() @bci=16, line=783 (Compiled frame)
  - java.lang.Throwable.init(java.lang.String, java.lang.Throwable) @bci=24, 
 line=287 (Compiled frame)
  - java.lang.Exception.init(java.lang.String, java.lang.Throwable) @bci=3, 
 line=84 (Compiled frame)
  - org.json4s.package$MappingException.init(java.lang.String, 
 java.lang.Exception) @bci=13, line=56 (Compiled frame)
  - org.json4s.reflect.package$.fail(java.lang.String, java.lang.Exception) 
 @bci=6, line=96 (Compiled frame)
  - org.json4s.Extraction$.convert(org.json4s.JsonAST$JValue, 
 org.json4s.reflect.ScalaType, org.json4s.Formats, scala.Option) @bci=2447, 
 line=554 (Compiled frame)
  - org.json4s.Extraction$.extract(org.json4s.JsonAST$JValue, 
 org.json4s.reflect.ScalaType, org.json4s.Formats) @bci=796, line=331 
 (Compiled frame)
  - org.json4s.Extraction$.extract(org.json4s.JsonAST$JValue, 
 org.json4s.Formats, scala.reflect.Manifest) @bci=10, line=42 (Compiled frame)
  - org.json4s.Extraction$.extractOpt(org.json4s.JsonAST$JValue, 
 org.json4s.Formats, scala.reflect.Manifest) @bci=7, line=54 (Compiled frame)
  - org.json4s.ExtractableJsonAstNode.extractOpt(org.json4s.Formats, 
 scala.reflect.Manifest) @bci=9, line=40 (Compiled frame)
  - 
 org.apache.spark.util.JsonProtocol$.shuffleWriteMetricsFromJson(org.json4s.JsonAST$JValue)
  @bci=116, line=702 (Compiled frame)
  - 
 org.apache.spark.util.JsonProtocol$$anonfun$taskMetricsFromJson$2.apply(org.json4s.JsonAST$JValue)
  @bci=4, line=670 (Compiled frame)
  - 
 org.apache.spark.util.JsonProtocol$$anonfun$taskMetricsFromJson$2.apply(java.lang.Object)
  @bci=5, line=670 (Compiled frame)
  - scala.Option.map(scala.Function1) @bci=22, line=145 (Compiled frame)
  - 
 

Re: Nightly builds/releases?

2015-05-04 Thread Ankur Chauhan
Hi,

There is also a make-distribution.sh file in the repository root. If someone 
with Jenkins access could create a simple builder, that would be awesome.
But I am guessing that besides the spark binary, one would probably also want the 
maven artifacts to work with it (lower priority though).

-- Ankur
 On 4 May 2015, at 20:11, Ted Yu yuzhih...@gmail.com wrote:
 
 See this related thread:
 http://search-hadoop.com/m/JW1q5bnnyT1
 
 Cheers
 
 On Mon, May 4, 2015 at 7:58 PM, Guru Medasani gdm...@gmail.com wrote:
 I see a Jira for this one, but unresolved.
 
 https://issues.apache.org/jira/browse/SPARK-1517
 
 
 
 
 On May 4, 2015, at 10:25 PM, Ankur Chauhan achau...@brightcove.com wrote:
 
 Hi,
 
  Does anyone know if spark has any nightly builds or equivalent that provide 
  binaries which have passed a CI build, so that one could try out the bleeding 
  edge without having to compile?
 
 -- Ankur
 
 



signature.asc
Description: Message signed with OpenPGP using GPGMail


Spark + Mesos + HDFS resource split

2015-04-27 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

I am building a mesos cluster for the purpose of using it to run
spark workloads (in addition to other frameworks). I am under the
impression that it is preferable/recommended to run the hdfs datanode
process and the spark slave on the same physical node (or EC2 instance or VM).

My question is: What is the recommended resource splitting? How much
memory and CPU should I preallocate for HDFS and how much should I set
aside as allocatable by mesos? In addition, is there some
rule-of-thumb recommendation around this?

- -- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVPgiLAAoJEOSJAMhvLp3L2fEIANkmfTjzEhjQ1IEc5W59F8sP
mT06qpxnd3XPg8DFOPIKxxCAtsVU1fImAOnFYobi9mQlEzcEbDtPMoLh0uStFIIS
cuorK4j0Am8Y1xxYa8BhKuWEtpYoFtSEYIF5eHe5vNlt5FlEvs3vTJ3N/zFbxVsq
I0FQH8r9u27pBJ9/rACyruYhgh/b5Tc6s39uKDFFJnhDWezMF2sF1WCgcIbZRP4+
PAhqLNPuVNAPcpi9JAe8u91d8yeFFVb/00mO2am2cr0BcHnfeWq6ZFftZUQrX3PK
FvD7FpfeFLCS5FinDqMHp2nkGetlJMQsIYRzvn3tmim8OeE6ppFsO0LnRNEqEtQ=
=I22x
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



[jira] [Commented] (SPARK-6707) Mesos Scheduler should allow the user to specify constraints based on slave attributes

2015-04-17 Thread Ankur Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500682#comment-14500682
 ] 

Ankur Chauhan commented on SPARK-6707:
--

I have a patch (pull request: https://github.com/apache/spark/pull/5563) for 
this but I need some help figuring out some good tests for it. I would really 
appreciate some feedback.

 Mesos Scheduler should allow the user to specify constraints based on slave 
 attributes
 --

 Key: SPARK-6707
 URL: https://issues.apache.org/jira/browse/SPARK-6707
 Project: Spark
  Issue Type: Improvement
  Components: Mesos, Scheduler
Affects Versions: 1.3.0
Reporter: Ankur Chauhan
  Labels: mesos, scheduler

 Currently, the mesos scheduler only looks at the 'cpu' and 'mem' resources 
 when trying to determine the usability of a resource offer from a mesos 
 slave node. It may be preferable for the user to be able to ensure that the 
 spark jobs are only started on a certain set of nodes (based on attributes). 
 For example, If the user sets a property, let's say 
 {code}spark.mesos.constraints{code} is set to 
 {code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
 checked to see if they meet both these constraints and only then will be 
 accepted to start new executors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-6707) Mesos Scheduler should allow the user to specify constraints based on slave attributes

2015-04-04 Thread Ankur Chauhan (JIRA)
Ankur Chauhan created SPARK-6707:


 Summary: Mesos Scheduler should allow the user to specify 
constraints based on slave attributes
 Key: SPARK-6707
 URL: https://issues.apache.org/jira/browse/SPARK-6707
 Project: Spark
  Issue Type: Improvement
  Components: Mesos, Scheduler
Affects Versions: 1.3.0
Reporter: Ankur Chauhan






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-6707) Mesos Scheduler should allow the user to specify constraints based on slave attributes

2015-04-04 Thread Ankur Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Chauhan updated SPARK-6707:
-
Description: 
Currently, the mesos scheduler only looks at the `cpu` and `mem` resources when 
trying to determine the usability of a resource offer from a mesos slave node. 
It may be preferable for the user to be able to ensure that the spark jobs are 
only started on a certain set of nodes (based on attributes). 

For example, If the user sets a property, let's say 
{code}spark.mesos.constraints{code} is set to 
{code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
checked to see if they meet both these constraints and only then will be 
accepted to start new executors.

  was:
Currently, the mesos scheduler only looks at the `cpu` and `mem` resources when 
trying to determine the usability of a resource offer from a mesos slave node. 
It may be preferable for the user to be able to ensure that the spark jobs are 
only started on a certain set of nodes (based on attributes). 

For example, If the user sets a property, let's say `spark.mesos.constraints` 
is set to `tachyon=true;us-east-1=false`, then the resource offers will be 
checked to see if they meet both these constraints and only then will be 
accepted to start new executors.


 Mesos Scheduler should allow the user to specify constraints based on slave 
 attributes
 --

 Key: SPARK-6707
 URL: https://issues.apache.org/jira/browse/SPARK-6707
 Project: Spark
  Issue Type: Improvement
  Components: Mesos, Scheduler
Affects Versions: 1.3.0
Reporter: Ankur Chauhan
  Labels: mesos, scheduler

 Currently, the mesos scheduler only looks at the `cpu` and `mem` resources 
 when trying to determine the usability of a resource offer from a mesos 
 slave node. It may be preferable for the user to be able to ensure that the 
 spark jobs are only started on a certain set of nodes (based on attributes). 
 For example, If the user sets a property, let's say 
 {code}spark.mesos.constraints{code} is set to 
 {code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
 checked to see if they meet both these constraints and only then will be 
 accepted to start new executors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-6707) Mesos Scheduler should allow the user to specify constraints based on slave attributes

2015-04-04 Thread Ankur Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Chauhan updated SPARK-6707:
-
Description: 
Currently, the mesos scheduler only looks at the 'cpu' and 'mem' resources when 
trying to determine the usability of a resource offer from a mesos slave node. 
It may be preferable for the user to be able to ensure that the spark jobs are 
only started on a certain set of nodes (based on attributes). 

For example, If the user sets a property, let's say 
{code}spark.mesos.constraints{code} is set to 
{code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
checked to see if they meet both these constraints and only then will be 
accepted to start new executors.

  was:
Currently, the mesos scheduler only looks at the `cpu` and `mem` resources when 
trying to determine the usability of a resource offer from a mesos slave node. 
It may be preferable for the user to be able to ensure that the spark jobs are 
only started on a certain set of nodes (based on attributes). 

For example, If the user sets a property, let's say 
{code}spark.mesos.constraints{code} is set to 
{code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
checked to see if they meet both these constraints and only then will be 
accepted to start new executors.


 Mesos Scheduler should allow the user to specify constraints based on slave 
 attributes
 --

 Key: SPARK-6707
 URL: https://issues.apache.org/jira/browse/SPARK-6707
 Project: Spark
  Issue Type: Improvement
  Components: Mesos, Scheduler
Affects Versions: 1.3.0
Reporter: Ankur Chauhan
  Labels: mesos, scheduler

 Currently, the mesos scheduler only looks at the 'cpu' and 'mem' resources 
 when trying to determine the usability of a resource offer from a mesos 
 slave node. It may be preferable for the user to be able to ensure that the 
 spark jobs are only started on a certain set of nodes (based on attributes). 
 For example, If the user sets a property, let's say 
 {code}spark.mesos.constraints{code} is set to 
 {code}tachyon=true;us-east-1=false{code}, then the resource offers will be 
 checked to see if they meet both these constraints and only then will be 
 accepted to start new executors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: spark mesos deployment : starting workers based on attributes

2015-04-04 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

Created issue: https://issues.apache.org/jira/browse/SPARK-6707
I would really appreciate ideas/views/opinions on this feature.

- -- Ankur Chauhan

On 03/04/2015 13:23, Tim Chen wrote:
 Hi Ankur,
 
 There isn't a way to do that yet, but it's simple to add.
 
 Can you create a JIRA in Spark for this?
 
 Thanks!
 
 Tim
 
 On Fri, Apr 3, 2015 at 1:08 PM, Ankur Chauhan
 achau...@brightcove.com mailto:achau...@brightcove.com wrote:
 
 Hi,
 
 I am trying to figure out if there is a way to tell the mesos 
 scheduler in spark to isolate the workers to a set of mesos slaves 
 that have a given attribute such as `tachyon:true`.
 
  Does anyone know if that is possible, or how I could achieve such
  behavior?
 
 Thanks! -- Ankur Chauhan
 
 -

 
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 mailto:user-unsubscr...@spark.apache.org For additional commands,
 e-mail: user-h...@spark.apache.org 
 mailto:user-h...@spark.apache.org
 
 

- -- 
- -- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVH4xBAAoJEOSJAMhvLp3LMfsH/0oyQ4fGomCd8GnQzqVrZ6zj
cgwhOyntz5aaBdjipVez1EwzNzG/3kXzFnK3YpuT6SXdXuPLSD6NX62ju/Ii+86w
/Y15taXt1qo+Ah6CLkofCPAPY1HRCZ+KAM/KzW45W+uGvcUqyupPFUEvN/a9/hYC
Ok7AERk8Tw/CRoU/Fbz/23LxjK1TJUW1klaToUjyij2oakMUxT7HnqS08fCUBJF6
pEqXJ+gHGW3br6BJcvwce7my8bFlPShVP+exhcNhpmqjoRvSf//etmP2E0Me2hXM
ZmghjIqRhoAI4sJYIhEBBQS7r4AsI5FQyNkr8i4Hqed4dq61YA7FcpUCC+GDbTY=
=pVkB
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



[jira] [Updated] (SPARK-6707) Mesos Scheduler should allow the user to specify constraints based on slave attributes

2015-04-04 Thread Ankur Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Chauhan updated SPARK-6707:
-
Description: 
Currently, the mesos scheduler only looks at the `cpu` and `mem` resources when 
trying to determine the usability of a resource offer from a mesos slave node. 
It may be preferable for the user to be able to ensure that the spark jobs are 
only started on a certain set of nodes (based on attributes). 

For example, If the user sets a property, let's say `spark.mesos.constraints` 
is set to `tachyon=true;us-east-1=false`, then the resource offers will be 
checked to see if they meet both these constraints and only then will be 
accepted to start new executors.

 Mesos Scheduler should allow the user to specify constraints based on slave 
 attributes
 --

 Key: SPARK-6707
 URL: https://issues.apache.org/jira/browse/SPARK-6707
 Project: Spark
  Issue Type: Improvement
  Components: Mesos, Scheduler
Affects Versions: 1.3.0
Reporter: Ankur Chauhan
  Labels: mesos, scheduler

 Currently, the mesos scheduler only looks at the `cpu` and `mem` resources 
 when trying to determine the usability of a resource offer from a mesos 
 slave node. It may be preferable for the user to be able to ensure that the 
 spark jobs are only started on a certain set of nodes (based on attributes). 
 For example, If the user sets a property, let's say `spark.mesos.constraints` 
 is set to `tachyon=true;us-east-1=false`, then the resource offers will be 
 checked to see if they meet both these constraints and only then will be 
 accepted to start new executors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: spark mesos deployment : starting workers based on attributes

2015-04-03 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

Thanks! I'll add the JIRA. I'll also try to work on a patch this weekend.

- -- Ankur Chauhan

On 03/04/2015 13:23, Tim Chen wrote:
 Hi Ankur,
 
 There isn't a way to do that yet, but it's simple to add.
 
 Can you create a JIRA in Spark for this?
 
 Thanks!
 
 Tim
 
 On Fri, Apr 3, 2015 at 1:08 PM, Ankur Chauhan
 achau...@brightcove.com mailto:achau...@brightcove.com wrote:
 
 Hi,
 
 I am trying to figure out if there is a way to tell the mesos 
 scheduler in spark to isolate the workers to a set of mesos slaves 
 that have a given attribute such as `tachyon:true`.
 
  Does anyone know if that is possible, or how I could achieve such
  behavior?
 
 Thanks! -- Ankur Chauhan
 
 -

 
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 mailto:user-unsubscr...@spark.apache.org For additional commands,
 e-mail: user-h...@spark.apache.org 
 mailto:user-h...@spark.apache.org
 
 

- -- 
- -- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVHxMDAAoJEOSJAMhvLp3LEPAH/1T7Ywu2W2vEZR/f6KbP+xbd
CiECqbgy1lMw0TxK3jyoiGttTL0uDcgoqev5kjaUFaGgcpsbzZg2jiaqM5RagJRv
55HvGXtSXKQ3l5NlRyMsbmRGVu8qoV2qv2qrCQHLKhVc0ipXEQgSjrkDGx9yP397
Dz1tFMsY/bgvQL0nMAm/HwJokv701IDGeFXFNI4GXhLGcARYDHou4bY0nzZq+w8t
V9vEFji4jyroJmacHdX0np3KsA6tzVItD6Wi9tLKr0+UWDw2Fb1HfYK0CPYX+FK8
dEgZ/hKwNolAzfIF6kHyNKEIf6H6GKihdLxaB23Im7QojvgGNBTqfGV4tGoJLPc=
=KyHk
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



spark mesos deployment : starting workers based on attributes

2015-04-03 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

I am trying to figure out if there is a way to tell the mesos
scheduler in spark to isolate the workers to a set of mesos slaves
that have a given attribute such as `tachyon:true`.

Does anyone know if that is possible, or how I could achieve such behavior?

Thanks!
- -- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVHvMlAAoJEOSJAMhvLp3LaV0H/jtX+KQDyorUESLIKIxFV9KM
QjyPtVquwuZYcwLqCfQbo62RgE/LeTjjxzifTzMM5D6cf4ULBH1TcS3Is2EdOhSm
UTMfJyvK06VFvYMLiGjqN4sBG3DFdamQif18qUJoKXX/Z9cUQO9SaSjIezSq2gd8
0lM3NLEQjsXY5uRJyl9GYDxcFsXPVzt1crXAdrtVsIYAlFmhcrm1n/5+Peix89Oh
vgK1J7e0ei7Rc4/3BR2xr8f9us+Jfqym/xe+45h1YYZxZWrteCa48NOGixuUJjJe
zb1MxNrTFZhPrKFT7pz9kCUZXl7DW5hzoQCH07CXZZI3B7kFS+5rjuEIB9qZXPE=
=cadl
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Mesos - spark task constraints

2015-04-02 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

I am trying to figure out how to run spark jobs on a mesos cluster.
The mesos cluster has some nodes that have tachyon installed, and I
would like the spark jobs to be started only on those nodes. Each of
these nodes has been configured with the attribute
`tachyon:true`. Is there a way by which I can tell the spark
driver/scheduler to add this to the set of constraints while accepting
offers from the mesos master?

The main idea here is that I want all my spark tasks to preferably
run on tachyon-enabled nodes (if available) and fall back to other
nodes otherwise. I realize that the preference part may not be possible,
but I at least want to start with just getting them to run only on the
tachyon-enabled nodes.
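
Concretely, the offer-filtering check I have in mind is roughly the sketch
below (just an illustration; the constraint-string format and names are
something I made up, not an existing spark setting):

// parse a constraint spec like "tachyon:true;us-east-1:false" into a map
def parseConstraints(spec: String): Map[String, String] =
  spec.split(";").filter(_.nonEmpty).map { kv =>
    val Array(key, value) = kv.split(":", 2)
    key -> value
  }.toMap

// an offer qualifies only if every constraint matches an advertised attribute
def offerMatches(offerAttributes: Map[String, String],
                 constraints: Map[String, String]): Boolean =
  constraints.forall { case (key, value) => offerAttributes.get(key) == Some(value) }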

Also, if someone could give me a pointer to the mesos scheduler code
in spark, that would be great.

- -- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVHZzQAAoJEOSJAMhvLp3L6GgH/1yWTJ4VE+7T1ysQTcPlqYhE
ATUbT98+v1Y1qmEzhnz2STsJCm2tJhvivJGjee1xIgUzKlg5CfxQcBvQ/DmrOZXJ
c4fwx7d+ZoCkAqqLsqf6Pvi+31s/TL311bhqJYcR9msOuIoVFPjyMQEI9QDV+yrE
goNb684n+dKLDKJbThIou6gVCzlHw0z2FbJ7WGFMy3seYMH8RiF72Dm4pL2kmjCb
V9Zc8VQJAfONFpl7Lh/gUj7njkyx2su0NY5d5a+sS+FywzwQX5dLPs43T+/7UfIg
B1DhfkEsn7aNqauXh0WI13CKjKhoTaEnuFhllErl+9GzKinPQUNEZfCG5SnnXd4=
=VOPR
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



deployment of spark on mesos and data locality in tachyon/hdfs

2015-03-31 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

I am fairly new to the spark ecosystem and I have been trying to set up
a spark on mesos deployment. I can't seem to figure out the best
practices around HDFS and Tachyon. The documentation about Spark's
data-locality section seems to suggest that each of my mesos slave nodes
should also run an hdfs datanode. This seems fine, but I can't seem to
figure out how I would go about pushing tachyon into the mix.

How should I organize my cluster?
Should tachyon be colocated on my mesos worker nodes, or should all
the spark jobs reach out to a separate hdfs/tachyon cluster?

- -- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVGy4bAAoJEOSJAMhvLp3L5bkH/0MECyZkh3ptWzmsNnSNfGWp
Oh93TUfD+foXO2ya9D+hxuyAxbjfXs/68aCWZsUT6qdlBQU9T1vX+CmPOnpY1KPN
NJP3af+VK0osaFPo6k28OTql1iTnvb9Nq+WDlohxBC/hZtoYl4cVxu8JmRlou/nb
/wfpp0ShmJnlxsoPa6mVdwzjUjVQAfEpuet3Ow5veXeA9X7S55k/h0ZQrZtO8eXL
jJsKaT8ne9WZPhZwA4PkdzTxkXF3JNveCIKPzNttsJIaLlvd0nLA/wu6QWmxskp6
iliGSmEk5P1zZWPPnk+TPIqbA0Ttue7PeXpSrbA9+pYiNT4R/wAneMvmpTABuR4=
=8ijP
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: deployment of spark on mesos and data locality in tachyon/hdfs

2015-03-31 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi Haoyuan,

So on each mesos slave node I should allocate/section off some amount
of memory for tachyon (let's say 50% of the total memory) and the rest
for regular mesos tasks?

This means, on each slave node I would have a tachyon worker (+ hdfs
configuration to talk to s3 or the hdfs datanode) and the mesos slave
process. Is this correct?

On 31/03/2015 16:43, Haoyuan Li wrote:
 Tachyon should be co-located with Spark in this case.
 
 Best,
 
 Haoyuan
 
 On Tue, Mar 31, 2015 at 4:30 PM, Ankur Chauhan
 achau...@brightcove.com mailto:achau...@brightcove.com wrote:
 
 Hi,
 
  I am fairly new to the spark ecosystem and I have been trying to
  set up a spark on mesos deployment. I can't seem to figure out the
  best practices around HDFS and Tachyon. The documentation about
  Spark's data-locality section seems to suggest that each of my mesos
  slave nodes should also run an hdfs datanode. This seems fine, but I
  can't seem to figure out how I would go about pushing tachyon into
  the mix.
  
  How should I organize my cluster? Should tachyon be colocated on my
  mesos worker nodes, or should all the spark jobs reach out to a
  separate hdfs/tachyon cluster?
 
 -- Ankur Chauhan
 
 -

 
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 mailto:user-unsubscr...@spark.apache.org For additional commands,
 e-mail: user-h...@spark.apache.org 
 mailto:user-h...@spark.apache.org
 
 
 
 
 -- Haoyuan Li AMPLab, EECS, UC Berkeley 
 http://www.cs.berkeley.edu/~haoyuan/

- -- 
- -- Ankur Chauhan
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJVGzKUAAoJEOSJAMhvLp3L3W4IAIVYiEKIZbC1a36/KWo94xYB
dvE4VXxF7z5FWmpuaHBEa+U1XWrR4cLVsQhocusOFn+oC7bstdltt3cGNAuwFSv6
Oogs4Sl1J4YZm8omKVdCkwD6Hv71HSntM8llz3qTW+Ljk2aKhfvNtp5nioQAm3e+
bs4ZKlCBij/xV3LbYYIePSS3lL0d9m1qEDJvi6jFcfm3gnBYeNeL9x92B5ylyth0
BGHnPN4sV/yopgrqOimLb12gSexHGNP1y6JBYy8NrHRY8SxkZ4sWKuyDnGDCOPOc
HC14Parf5Ly5FEz5g5WjF6HrXRdPlgr2ABxSLWOAB/siXsX9o/4yCy7NtDNcL6Y=
=f2xI
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Mesos 0.22.0

2015-01-20 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

+1

On 20/01/2015 20:00, Manivannan wrote:
 +1
 
 On Wed, Jan 21, 2015 at 7:04 AM, Vinod Kone vinodk...@gmail.com 
 mailto:vinodk...@gmail.com wrote:
 
 +1
 
 @vinodkone
 
 On Jan 20, 2015, at 12:03 PM, Chris Aniszczyk z...@twitter.com 
 mailto:z...@twitter.com wrote:
 
 definite +1, lets keep the release rhythm going!
 
 maybe some space on the wiki for release planning / release 
 managers would be a step forward
 
 On Tue, Jan 20, 2015 at 1:59 PM, Joe Stein joe.st...@stealth.ly 
 mailto:joe.st...@stealth.ly wrote:
 
 +1
 
 so excited for the persistence primitives, awesome!
 
 /*** Joe Stein Founder,
 Principal Consultant Big Data Open Source Security LLC 
 http://www.stealth.ly Twitter: @allthingshadoop 
 http://www.twitter.com/allthingshadoop 
 /
 
 On Tue, Jan 20, 2015 at 2:55 PM, John Pampuch j...@mesosphere.io
 mailto:j...@mesosphere.io wrote:
 
 +1!
 
 -John
 
 
 On Tue, Jan 20, 2015 at 11:52 AM, Niklas Nielsen 
 nik...@mesosphere.io mailto:nik...@mesosphere.io wrote:
 
 Hi all,
 
 We have been releasing major versions of Mesos roughly
 every second month
 (current average is ~66 days) and we are now 2 months
 after the 0.21.0
 release, so I would like to propose that we start
 planning for 0.22.0
 Not only in terms of timing, but also because we have
 some exciting
 features which are getting ready, including persistence
 primitives, modules
 and SSL support (I probably forgot a ton - please chime in).
 
 Since we are stakeholders in SSL and Modules, I would
 like to volunteer as
 release manager. Like in previous releases, I'd be happy to
 collaborate
 with co-release
 managers to make 0.22.0 a successful release.
 
 Niklas
 
 
 
 
 
 
 -- Cheers,
 
 Chris Aniszczyk | Open Source | Twitter, Inc. @cra | +1 512 961
 6719 tel:%2B1%20512%20961%206719
 
 
-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJUvyt8AAoJEOSJAMhvLp3LREwH/jsPJwSGr8ZGoi2JPv62rQsm
qwhGrmrfeKJEWijA6B8AnoUxKRVEdO65uiASpDfTVkPrrcQ+S93Pw75W6NX0oTvD
gwP8gSyDLIxAGTCkNkkxIVWueOD8Uca9s80k62dcC/LAq/2xQ9WuK+o0g2ArqVAb
035NboUM/oDYITZuauYSZCLyIRqSM6+FBpLXR0ZduEgQs5Ndq8O7FX95K+1wBmFs
yotqyQQXktlNTapMv1rCezLSYWMjX48HlKVBg3bMyamCbijcOLAk9VwH8nD+n8wA
V2JAEDt4gYorkVldOVJl+QquyWgImm7CKK43g8ZKjH5mmXKptAyFzEMh9FhloLg=
=xHRw
-END PGP SIGNATURE-


Re: [VOTE] Release Apache Mesos 0.21.1 (rc2)

2014-12-30 Thread Ankur Chauhan
+1

Sent from my iPhone

 On Dec 30, 2014, at 16:01, Tim Chen t...@mesosphere.io wrote:
 
 Hi all,
 
 Just a reminder the vote is up for another 2 hours, let me know if any of you 
 have any objections.
 
 Thanks,
 
 Tim
 
 On Mon, Dec 29, 2014 at 5:32 AM, Niklas Nielsen nik...@mesosphere.io wrote:
 +1, Compiled and tested on Ubuntu Trusty, CentOS Linux 7 and Mac OS X
 
 Thanks guys!
 Niklas
 
 
 On 19 December 2014 at 22:02, Tim Chen t...@mesosphere.io wrote:
 Hi Ankur,
 
 Since MESOS-1711 is just a minor improvement I'm inclined to include it for 
 the next major release which shouldn't be too far away from this release.
 
 If anyone else thinks otherwise please let me know.
 
 Tim
 
 On Fri, Dec 19, 2014 at 12:44 PM, Ankur Chauhan an...@malloc64.com wrote:
  Sorry for the late join-in. Can we get 
  https://issues.apache.org/jira/plugins/servlet/mobile#issue/MESOS-1711 in 
  too, or is it too late?
 -- ankur 
 Sent from my iPhone
 
 On Dec 19, 2014, at 12:23, Tim Chen t...@mesosphere.io wrote:
 
 Hi all,
 
 Please vote on releasing the following candidate as Apache Mesos 0.21.1.
 
 
 0.21.1 includes the following:
 
 * This is a bug fix release.
 
 ** Bug
   * [MESOS-2047] Isolator cleanup failures shouldn't cause TASK_LOST.
   * [MESOS-2071] Libprocess generates invalid HTTP
   * [MESOS-2147] Large number of connections slows statistics.json 
 responses.
   * [MESOS-2182] Performance issue in libprocess SocketManager.
 
 ** Improvement
   * [MESOS-1925] Docker kill does not allow containers to exit gracefully
   * [MESOS-2113] Improve configure to find apr and svn libraries/headers 
 in OSX
 
 The CHANGELOG for the release is available at:
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.21.1-rc2
 
 
 The candidate for Mesos 0.21.1 release is available at:
 https://dist.apache.org/repos/dist/dev/mesos/0.21.1-rc2/mesos-0.21.1.tar.gz
 
 The tag to be voted on is 0.21.1-rc2:
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.21.1-rc2
 
 The MD5 checksum of the tarball can be found at:
 https://dist.apache.org/repos/dist/dev/mesos/0.21.1-rc2/mesos-0.21.1.tar.gz.md5
 
 The signature of the tarball can be found at:
 https://dist.apache.org/repos/dist/dev/mesos/0.21.1-rc2/mesos-0.21.1.tar.gz.asc
 
 The PGP key used to sign the release is here:
 https://dist.apache.org/repos/dist/release/mesos/KEYS
 
 The JAR is up in Maven in a staging repository here:
 https://repository.apache.org/content/repositories/orgapachemesos-1046
 
 Please vote on releasing this package as Apache Mesos 0.21.1!
 
 The vote is open until Tue Dec 23 18:00:00 PST 2014 and passes if a 
 majority of at least 3 +1 PMC votes are cast.
 
 [ ] +1 Release this package as Apache Mesos 0.21.1
 [ ] -1 Do not release this package because ...
 
 Thanks,
 
 Tim  Till
 


Re: Review Request 27670: Bug fix: Check for non-zero status code and hadoop client not found

2014-12-01 Thread Ankur Chauhan


 On Nov. 8, 2014, 8:14 a.m., Mesos ReviewBot wrote:
  Patch looks great!
  
  Reviews applied: [27670]
  
  All tests passed.
 
 Ankur Chauhan wrote:
  Bump: Are there any more review suggestions that need to be incorporated to 
  get this patch in?
 
 Vinod Kone wrote:
 i was waiting for @benh to ship.

I don't know if there is a feature to 'ping' benh. Maybe this just got lost in 
the flurry of emails.


- Ankur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27670/#review60481
---


On Nov. 8, 2014, 7:28 a.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27670/
 ---
 
 (Updated Nov. 8, 2014, 7:28 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Vinod Kone.
 
 
 Bugs: MESOS-1711
 https://issues.apache.org/jira/browse/MESOS-1711
 
 
 Repository: mesos-git
 
 
 Description
 ---
 
  @benh noticed that the fetcher would report success in the case where the 
  hadoop client returns a non-zero exit value
 
 
 Diffs
 -
 
   src/launcher/fetcher.cpp 400fadf94d35721cabaa9983b12a5d35f71f5b5b 
 
 Diff: https://reviews.apache.org/r/27670/diff/
 
 
 Testing
 ---
 
 make check
 sudo bin/mesos-tests.sh --verbose
 support/mesos-style.py
 
 
 Thanks,
 
 Ankur Chauhan
 




Re: Help needed

2014-11-22 Thread Ankur Chauhan
You could use HDFS, or mount NFS on all nodes and then mount that inside all the 
containers as a volume.

Sent from my iPhone

 On Nov 22, 2014, at 9:40 AM, Qiang qjavaswing2...@gmail.com wrote:
 
  I have been working with docker and mesos recently, and one of the apps I am 
  going to dockerize relies on file storage. I thought about using NFS and a 
  docker data volume container, but I don't know how I can use these 
  to address my problem. As far as I know, mesos has service discovery, but in 
  my case I don't think file storage can be made into a service.
 
 Any idea to save my day?
 
 Thanks,
 
 -- 
 Qiang Han


Re: Review Request 27670: Bug fix: Check for non-zero status code and hadoop client not found

2014-11-18 Thread Ankur Chauhan


 On Nov. 8, 2014, 8:14 a.m., Mesos ReviewBot wrote:
  Patch looks great!
  
  Reviews applied: [27670]
  
  All tests passed.

Bump: Are there any more review suggestions that need to be incorporated to get 
this patch in?


- Ankur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27670/#review60481
---


On Nov. 8, 2014, 7:28 a.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27670/
 ---
 
 (Updated Nov. 8, 2014, 7:28 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Vinod Kone.
 
 
 Bugs: MESOS-1711
 https://issues.apache.org/jira/browse/MESOS-1711
 
 
 Repository: mesos-git
 
 
 Description
 ---
 
  @benh noticed that the fetcher would report success in the case where the 
  hadoop client returns a non-zero exit value
 
 
 Diffs
 -
 
   src/launcher/fetcher.cpp 400fadf94d35721cabaa9983b12a5d35f71f5b5b 
 
 Diff: https://reviews.apache.org/r/27670/diff/
 
 
 Testing
 ---
 
 make check
 sudo bin/mesos-tests.sh --verbose
 support/mesos-style.py
 
 
 Thanks,
 
 Ankur Chauhan
 




Re: Relative dates are parsed incorrectly.

2014-11-12 Thread Ankur Chauhan
Please have a look at 
http://mesos.apache.org/documentation/latest/mesos-developers-guide/
That should help you get set up and started. GitHub pull requests are not the 
suggested medium for contributing changes; you will need to submit patches to 
Review Board.

Ankur 
Sent from my iPhone

 On Nov 12, 2014, at 8:21 AM, Leigh Martell le...@immun.io wrote:
 
  I have never done such a thing for this project, but the github repo is here: 
  https://github.com/apache/mesos. I would even encourage you to do so. I have 
  not looked, but there should be a contribution guideline somewhere.
 
 Take Care!
 
 -Leigh
 
 On Wed, Nov 12, 2014 at 11:03 AM, Adam Shannon adam.shan...@banno.com 
 wrote:
 Hi all,
 
 This morning I noticed that the webui is parsing dates incorrectly. It 
 ignores the timezone sent along the iso 8601 format. I've opened an issue 
 here:
 
 https://issues.apache.org/jira/browse/MESOS-2088
 
 My question is, would I be able to open a PR and get this fixed?
 
 Thanks.
 
 -- 
 Adam Shannon | Software Engineer | Banno | ProfitStars®
 206 6th Ave Suite 1020 | Des Moines, IA 50309 | Cell: 515.867.8337
 


Re: [ANN] Mesos resources searchable

2014-11-10 Thread Ankur Chauhan
Wow, that is amazing. Having all these things in a single place is awesome.

-- Ankur

 On 10 Nov 2014, at 15:41, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
 
 Hi everyone,
 
 Quick announcement that we've added Apache Mesos and made all its resources 
 searchable:
 
 http://search-hadoop.com/mesos http://search-hadoop.com/mesos
 
 This let's you search all of the following in one go:
 * user  dev mailing list
 * JIRA issues
 * source code
 * java docs
 * web site
 
 Enjoy!
 
 Otis
 --
 Monitoring * Alerting * Anomaly Detection * Centralized Log Management
 Solr  Elasticsearch Support * http://sematext.com/ http://sematext.com/
 





Re: Review Request 27670: Bug fix: Check for non-zero status code and hadoop client not found

2014-11-07 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27670/
---

(Updated Nov. 8, 2014, 7:26 a.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

@benh noticed that the fetcher would report success in the case where the hadoop 
client returns a non-zero exit value


Diffs (updated)
-

  src/launcher/fetcher.cpp 400fadf94d35721cabaa9983b12a5d35f71f5b5b 

Diff: https://reviews.apache.org/r/27670/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27670: Bug fix: Check for non-zero status code and hadoop client not found

2014-11-07 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27670/
---

(Updated Nov. 8, 2014, 7:28 a.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Changes
---

Simplify if expression


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

@benh noticed that the fetcher would report success in the case where the hadoop 
client returns a non-zero exit value


Diffs (updated)
-

  src/launcher/fetcher.cpp 400fadf94d35721cabaa9983b12a5d35f71f5b5b 

Diff: https://reviews.apache.org/r/27670/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27670: Bug fix: Check for non-zero status code and hadoop client not found

2014-11-06 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27670/
---

(Updated Nov. 6, 2014, 7:03 p.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Changes
---

Added MESOS-1711 as the 'bugs' field.


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

@benh noticed that the fetcher would report success in the case where the hadoop 
client returns a non-zero exit value


Diffs
-

  src/launcher/fetcher.cpp bd95928bc3191970330e839bcf41e343d5142c54 

Diff: https://reviews.apache.org/r/27670/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: [VOTE] Release Apache Mesos 0.21.0 (rc1)

2014-11-06 Thread Ankur Chauhan
Hi Tim,

FYI: I posted a fix for the bug pointed out by @benh at 
https://reviews.apache.org/r/27670/ https://reviews.apache.org/r/27670/

-- Ankur
 On 6 Nov 2014, at 15:31, Timothy Chen tnac...@gmail.com wrote:
 
 The Fetcher patch currently contains a bug that is going to be patched
 up by a new reviewboard, so I suggest we leave it for the next
 version.
 
 Tim
 
 On Thu, Nov 6, 2014 at 2:29 PM, Adam Bordelon a...@mesosphere.io wrote:
 Also note, in addition to 'Support for Mesos modules' we also now allow
 pluggable isolator modules.
 
 On Thu, Nov 6, 2014 at 2:28 PM, Adam Bordelon a...@mesosphere.io wrote:
 
 make check passed on LinuxMint16 (Ubuntu 13.10)
 
 Would love to see the following two cherry-picked, especially the
 configuration.md change before you update the website for the release
 blog post.
 
 d09478d73cb3ddd76c7b886bef691c8034fa6908 Updated docs/configuration.md.
 c7227471f98c0dc62c8700d41534be142a3fcfad Fetcher uses Hadoop to fetch URIs
 with unknown schemes.
 
 
 On Thu, Nov 6, 2014 at 1:00 PM, Tom Arnfeld t...@duedil.com wrote:
 
 +1
 
 
 
 
 `make check` passed on Ubuntu 12.04 LTS (kernel 3.2.0-67)
 
 
 --
 
 
 Tom Arnfeld
 
 Developer // DueDil
 
 
 
 
 
 (+44) 7525940046
 
 25 Christopher Street, London, EC2A 2BS
 
 On Thu, Nov 6, 2014 at 8:43 PM, Ian Downes idow...@twitter.com.invalid
 wrote:
 
 Apologies: I used support/tag.sh but had a local branch *and* local tag
 and
 it pushed the branch only.
 $ git ls-remote --tags origin-wip | grep 0.21.0
 a7733493dc9e6f2447f825671d8a745602c9bf7a refs/tags/0.21.0-rc1
 On Thu, Nov 6, 2014 at 8:11 AM, Tim St Clair tstcl...@redhat.com
 wrote:
 $ git tag -l | grep 21
 
 $ git branch -r
  origin/0.21.0-rc1
 
 It looks like you created a branch vs. tag ...?
 
 Cheers,
 Tim
 
 - Original Message -
 From: Ian Downes ian.dow...@gmail.com
 To: dev@mesos.apache.org, u...@mesos.apache.org
 Sent: Wednesday, November 5, 2014 5:12:52 PM
 Subject: [VOTE] Release Apache Mesos 0.21.0 (rc1)
 
 Hi all,
 
 Please vote on releasing the following candidate as Apache Mesos
 0.21.0.
 
 
 0.21.0 includes the following:
 
 
 
 State reconciliation for frameworks
 Support for Mesos modules
 Task status now includes source and reason
 A shared filesystem isolator
 A pid namespace isolator
 
 The CHANGELOG for the release is available at:
 
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.21.0-rc1
 
 
 
 
 The candidate for Mesos 0.21.0 release is available at:
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.21.0-rc1/mesos-0.21.0.tar.gz
 
 The tag to be voted on is 0.21.0-rc1:
 
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.21.0-rc1
 
 The MD5 checksum of the tarball can be found at:
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.21.0-rc1/mesos-0.21.0.tar.gz.md5
 
 The signature of the tarball can be found at:
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.21.0-rc1/mesos-0.21.0.tar.gz.asc
 
 The PGP key used to sign the release is here:
 https://dist.apache.org/repos/dist/release/mesos/KEYS
 
 The JAR is up in Maven in a staging repository here:
 
 https://repository.apache.org/content/repositories/orgapachemesos-1038
 
 Please vote on releasing this package as Apache Mesos 0.21.0!
 
 The vote is open until Sat Nov  8 15:09:48 PST 2014 and passes if a
 majority of at least 3 +1 PMC votes are cast.
 
 [ ] +1 Release this package as Apache Mesos 0.21.0
 [ ] -1 Do not release this package because ...
 
 Thanks,
 
 Ian Downes
 
 
 --
 Cheers,
 Timothy St. Clair
 Red Hat Inc.
 
 
 
 



Logging warning in mesos fetcher

2014-11-05 Thread Ankur Chauhan
I noticed in a run of bin/mesos-tests.sh --gtest_filter=FetcherTest.*:

$ ./bin/mesos-tests.sh --gtest_filter=FetcherTest.*
Source directory: /Users/achauhan/Projects/mesos
Build directory: /Users/achauhan/Projects/mesos/build
-
We cannot run any Docker tests because:
Docker tests not supported on non-Linux systems
-
Note: Google Test filter = 
FetcherTest.*-DockerContainerizerTest.ROOT_DOCKER_Launch:DockerContainerizerTest.ROOT_DOCKER_Kill:DockerContainerizerTest.ROOT_DOCKER_Usage:DockerContainerizerTest.DISABLED_ROOT_DOCKER_Recover:DockerContainerizerTest.ROOT_DOCKER_Logs:DockerContainerizerTest.ROOT_DOCKER_Default_CMD:DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Override:DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Args:DockerContainerizerTest.ROOT_DOCKER_SlaveRecoveryTaskContainer:DockerContainerizerTest.DISABLED_ROOT_DOCKER_SlaveRecoveryExecutorContainer:DockerContainerizerTest.ROOT_DOCKER_PortMapping:DockerContainerizerTest.ROOT_DOCKER_LaunchSandboxWithColon:DockerTest.ROOT_DOCKER_interface:DockerTest.ROOT_DOCKER_CheckCommandWithShell:DockerTest.ROOT_DOCKER_CheckPortResource:DockerTest.ROOT_DOCKER_CancelPull:SlaveTest.ROOT_RunTaskWithCommandInfoWithoutUser:SlaveTest.DISABLED_ROOT_RunTaskWithCommandInfoWithUser:SlaveCount/Registrar_BENCHMARK_Test.performance/0:SlaveCount/Registrar_BENCHMARK_Test.performance/1:SlaveCount/Registrar_BENCHMARK_Test.performance/2:SlaveCount/Registrar_BENCHMARK_Test.performance/3
[==] Running 4 tests from 1 test case.
[--] Global test environment set-up.
[--] 4 tests from FetcherTest
[ RUN  ] FetcherTest.FileURI
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1104 22:43:38.412901 2068996864 fetcher.cpp:197] Fetching URI 
'file:///private/tmp/FetcherTest_FileURI_ipAYlv/from/test'
I1104 22:43:38.413653 2068996864 fetcher.cpp:178] Copying resource from 
'/private/tmp/FetcherTest_FileURI_ipAYlv/from/test' to 
'/private/tmp/FetcherTest_FileURI_ipAYlv'
I1104 22:43:38.419317 2068996864 fetcher.cpp:300] Skipped extracting path 
'/private/tmp/FetcherTest_FileURI_ipAYlv/test'
[   OK ] FetcherTest.FileURI (110 ms)
[ RUN  ] FetcherTest.FilePath
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1104 22:43:38.507416 2068996864 fetcher.cpp:197] Fetching URI 
'/private/tmp/FetcherTest_FilePath_lAzOZG/from/test'
I1104 22:43:38.508020 2068996864 fetcher.cpp:178] Copying resource from 
'/private/tmp/FetcherTest_FilePath_lAzOZG/from/test' to 
'/private/tmp/FetcherTest_FilePath_lAzOZG'
I1104 22:43:38.512864 2068996864 fetcher.cpp:300] Skipped extracting path 
'/private/tmp/FetcherTest_FilePath_lAzOZG/test'
[   OK ] FetcherTest.FilePath (100 ms)
[ RUN  ] FetcherTest.OSNetUriTest
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1104 22:43:38.608177 2068996864 fetcher.cpp:197] Fetching URI 
'http://www.example.com/index.html'
I1104 22:43:38.608901 2068996864 fetcher.cpp:109] Fetching URI 
'http://www.example.com/index.html' with os::net
I1104 22:43:38.608924 2068996864 fetcher.cpp:119] Downloading 
'http://www.example.com/index.html' to 
'/private/tmp/FetcherTest_OSNetUriTest_ZB4nAg/index.html'
I1104 22:43:38.619645 2068996864 fetcher.cpp:300] Skipped extracting path 
'/private/tmp/FetcherTest_OSNetUriTest_ZB4nAg/index.html'
[   OK ] FetcherTest.OSNetUriTest (105 ms)
[ RUN  ] FetcherTest.FileLocalhostURI
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1104 22:43:38.712461 2068996864 fetcher.cpp:197] Fetching URI 
'file://localhost/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/from/test'
I1104 22:43:38.713353 2068996864 fetcher.cpp:178] Copying resource from 
'/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/from/test' to 
'/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH'
I1104 22:43:38.718541 2068996864 fetcher.cpp:300] Skipped extracting path 
'/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/test'
[   OK ] FetcherTest.FileLocalhostURI (103 ms)
[--] 4 tests from FetcherTest (419 ms total)
 
[--] Global test environment tear-down
[==] 4 tests from 1 test case ran. (434 ms total)
[  PASSED  ] 4 tests.
 
  YOU HAVE 5 DISABLED TESTS

There are warnings in the output:

WARNING: Logging before InitGoogleLogging() is written to STDERR
And looking at the code, I don't see a glog initialization codeblock. Should 
there be google::InitGoogleLogging(argv[0]); in the main method? I also cannot 
figure out how logging is configured in general and/or where mesos-fetcher is 
invoked. Can someone chime in?


-- Ankur


Re: Logging warning in mesos fetcher

2014-11-05 Thread Ankur Chauhan
Will do!

-- Ankur

 On 5 Nov 2014, at 10:56, Vinod Kone vinodk...@gmail.com wrote:
 
 Good catch. You want to add the log initialization to the main() in
 fetcher.cpp. See master/main.cpp or slave/main.cpp for examples.
 
 On Wed, Nov 5, 2014 at 10:32 AM, Ankur Chauhan an...@malloc64.com wrote:
 
 I noticed in a run of bin/mesos-tests.sh --gtest_filter=FetcherTest.*:
 
 $ ./bin/mesos-tests.sh --gtest_filter=FetcherTest.*
 Source directory: /Users/achauhan/Projects/mesos
 Build directory: /Users/achauhan/Projects/mesos/build
 -
 We cannot run any Docker tests because:
 Docker tests not supported on non-Linux systems
 -
 Note: Google Test filter =
 FetcherTest.*-DockerContainerizerTest.ROOT_DOCKER_Launch:DockerContainerizerTest.ROOT_DOCKER_Kill:DockerContainerizerTest.ROOT_DOCKER_Usage:DockerContainerizerTest.DISABLED_ROOT_DOCKER_Recover:DockerContainerizerTest.ROOT_DOCKER_Logs:DockerContainerizerTest.ROOT_DOCKER_Default_CMD:DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Override:DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Args:DockerContainerizerTest.ROOT_DOCKER_SlaveRecoveryTaskContainer:DockerContainerizerTest.DISABLED_ROOT_DOCKER_SlaveRecoveryExecutorContainer:DockerContainerizerTest.ROOT_DOCKER_PortMapping:DockerContainerizerTest.ROOT_DOCKER_LaunchSandboxWithColon:DockerTest.ROOT_DOCKER_interface:DockerTest.ROOT_DOCKER_CheckCommandWithShell:DockerTest.ROOT_DOCKER_CheckPortResource:DockerTest.ROOT_DOCKER_CancelPull:SlaveTest.ROOT_RunTaskWithCommandInfoWithoutUser:SlaveTest.DISABLED_ROOT_RunTaskWithCommandInfoWithUser:SlaveCount/Registrar_BENCHMARK_Test.performance/0:SlaveCount/Registrar_BENCHMARK_Test.performance/1:SlaveCount/Registrar_BENCHMARK_Test.performance/2:SlaveCount/Registrar_BENCHMARK_Test.performance/3
 [==] Running 4 tests from 1 test case.
 [--] Global test environment set-up.
 [--] 4 tests from FetcherTest
 [ RUN  ] FetcherTest.FileURI
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I1104 22:43:38.412901 2068996864 fetcher.cpp:197] Fetching URI
 'file:///private/tmp/FetcherTest_FileURI_ipAYlv/from/test'
 I1104 22:43:38.413653 2068996864 fetcher.cpp:178] Copying resource from
 '/private/tmp/FetcherTest_FileURI_ipAYlv/from/test' to
 '/private/tmp/FetcherTest_FileURI_ipAYlv'
 I1104 22:43:38.419317 2068996864 fetcher.cpp:300] Skipped extracting path
 '/private/tmp/FetcherTest_FileURI_ipAYlv/test'
 [   OK ] FetcherTest.FileURI (110 ms)
 [ RUN  ] FetcherTest.FilePath
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I1104 22:43:38.507416 2068996864 fetcher.cpp:197] Fetching URI
 '/private/tmp/FetcherTest_FilePath_lAzOZG/from/test'
 I1104 22:43:38.508020 2068996864 fetcher.cpp:178] Copying resource from
 '/private/tmp/FetcherTest_FilePath_lAzOZG/from/test' to
 '/private/tmp/FetcherTest_FilePath_lAzOZG'
 I1104 22:43:38.512864 2068996864 fetcher.cpp:300] Skipped extracting path
 '/private/tmp/FetcherTest_FilePath_lAzOZG/test'
 [   OK ] FetcherTest.FilePath (100 ms)
 [ RUN  ] FetcherTest.OSNetUriTest
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I1104 22:43:38.608177 2068996864 fetcher.cpp:197] Fetching URI '
 http://www.example.com/index.html'
 I1104 22:43:38.608901 2068996864 fetcher.cpp:109] Fetching URI '
 http://www.example.com/index.html' with os::net
 I1104 22:43:38.608924 2068996864 fetcher.cpp:119] Downloading '
 http://www.example.com/index.html' to
 '/private/tmp/FetcherTest_OSNetUriTest_ZB4nAg/index.html'
 I1104 22:43:38.619645 2068996864 fetcher.cpp:300] Skipped extracting path
 '/private/tmp/FetcherTest_OSNetUriTest_ZB4nAg/index.html'
 [   OK ] FetcherTest.OSNetUriTest (105 ms)
 [ RUN  ] FetcherTest.FileLocalhostURI
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I1104 22:43:38.712461 2068996864 fetcher.cpp:197] Fetching URI
 'file://localhost/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/from/test'
 I1104 22:43:38.713353 2068996864 fetcher.cpp:178] Copying resource from
 '/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/from/test' to
 '/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH'
 I1104 22:43:38.718541 2068996864 fetcher.cpp:300] Skipped extracting path
 '/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/test'
 [   OK ] FetcherTest.FileLocalhostURI (103 ms)
 [--] 4 tests from FetcherTest (419 ms total)
 
 [--] Global test environment tear-down
 [==] 4 tests from 1 test case ran. (434 ms total)
 [  PASSED  ] 4 tests.
 
  YOU HAVE 5 DISABLED TESTS
 
 There are warnings in the output:
 
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 And looking at the code, I don't see a glog initialization codeblock.
 Should there be google::InitGoogleLogging(argv[0]); in the main method? I
 also cannot figure out how logging is configured in general and/or where
 mesos-fetcher is invoked. Can someone

Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 5, 2014, 7:57 p.m.)


Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.


Changes
---

Initialize logging and add usual flags handling (same as per other parts of 
mesos), fix comments


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).
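
(A hedged sketch of the fetch ordering described above. Every helper below is an
illustrative stand-in, not the code in src/launcher/fetcher.cpp.)

```
#include <cstdlib>
#include <iostream>
#include <string>

// Illustrative stand-ins for the real fetch paths.
static bool hadoopUsable()
{
  // Usable only if a hadoop client is on PATH and `hadoop version` exits 0.
  return std::system("hadoop version > /dev/null 2>&1") == 0;
}

static bool startsWith(const std::string& s, const std::string& prefix)
{
  return s.compare(0, prefix.size(), prefix) == 0;
}

static bool fetchWithHadoop(const std::string& uri)
{
  std::cout << "hadoop fs -copyToLocal " << uri << std::endl;
  return true;
}

static bool fetchWithNet(const std::string& uri)
{
  std::cout << "os::net download " << uri << std::endl;
  return true;
}

static bool fetchWithLocalCopy(const std::string& uri)
{
  std::cout << "cp " << uri << std::endl;
  return true;
}

// The ordering described in the review: hadoop first (covers any filesystem
// configured in core-site.xml), then os::net for http(s)/ftp(s) URIs, then a
// plain local copy.
static bool fetch(const std::string& uri)
{
  if (hadoopUsable() && fetchWithHadoop(uri)) {
    return true;
  }

  if (startsWith(uri, "http://") || startsWith(uri, "https://") ||
      startsWith(uri, "ftp://") || startsWith(uri, "ftps://")) {
    return fetchWithNet(uri);
  }

  return fetchWithLocalCopy(uri);
}

int main()
{
  return fetch("hdfs://namenode/tmp/app.tar.gz") ? 0 : 1;
}
```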


Diffs (updated)
-

  src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
  src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
  src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/#review60020
---



src/launcher/fetcher.cpp
https://reviews.apache.org/r/27483/#comment101354

This change makes fetcher in line with slave/main.cpp or master/main.cpp 
and initialized the logging.


- Ankur Chauhan


On Nov. 5, 2014, 7:57 p.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27483/
 ---
 
 (Updated Nov. 5, 2014, 7:57 p.m.)
 
 
 Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.
 
 
 Bugs: MESOS-1711
 https://issues.apache.org/jira/browse/MESOS-1711
 
 
 Repository: mesos-git
 
 
 Description
 ---
 
 Previously, the fetcher used a hardcoded list of schemes to determine what 
 URIs could be fetched by hadoop (if available). This is now changed such that 
 we first check if hadoop can fetch them for us and then we fallback to the 
 os::net and then a local copy method (same as it used to be). This allows 
 users to fetch artifacts from arbitrary filesystems as long as hadoop is 
 correctly configured (in core-site.xml).
 
 
 Diffs
 -
 
   src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
   src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
   src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 
 
 Diff: https://reviews.apache.org/r/27483/diff/
 
 
 Testing
 ---
 
 make check
 sudo bin/mesos-tests.sh --verbose
 support/mesos-style.py
 
 
 Thanks,
 
 Ankur Chauhan
 




Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 5, 2014, 9:24 p.m.)


Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.


Changes
---

Add locally hosted http server for os::net tests


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).


Diffs (updated)
-

  src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
  src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
  src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 5, 2014, 10:07 p.m.)


Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.


Changes
---

Minor style fixes and removed unused code


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).


Diffs (updated)
-

  src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
  src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
  src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-05 Thread Ankur Chauhan


 On Nov. 5, 2014, 10:12 p.m., Vinod Kone wrote:
  Thanks! I'll get this committed once 0.21.0 is cut.

Thanks! And really appreciate all the input, help and hand holding. It's been a 
long time since I touched C++. Future patches would probably require less 
handholding.

Thanks everyone.. @vinod, @tnachen, @tstclair


- Ankur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/#review60049
---


On Nov. 5, 2014, 10:07 p.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27483/
 ---
 
 (Updated Nov. 5, 2014, 10:07 p.m.)
 
 
 Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.
 
 
 Bugs: MESOS-1711
 https://issues.apache.org/jira/browse/MESOS-1711
 
 
 Repository: mesos-git
 
 
 Description
 ---
 
 Previously, the fetcher used a hardcoded list of schemes to determine what 
 URIs could be fetched by hadoop (if available). This is now changed such that 
 we first check if hadoop can fetch them for us and then we fallback to the 
 os::net and then a local copy method (same as it used to be). This allows 
 users to fetch artifacts from arbitrary filesystems as long as hadoop is 
 correctly configured (in core-site.xml).
 
 
 Diffs
 -
 
   src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
   src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
   src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 
 
 Diff: https://reviews.apache.org/r/27483/diff/
 
 
 Testing
 ---
 
 make check
 sudo bin/mesos-tests.sh --verbose
 support/mesos-style.py
 
 
 Thanks,
 
 Ankur Chauhan
 




Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 6, 2014, 3:18 a.m.)


Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.


Changes
---

Bug fix: If Hadoop client is available but a version check returns a non-zero 
status code we should not continue with the fetcher. @ben
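
(A hedged illustration of the check described in this change; hadoopClientUsable
is a hypothetical helper, not the actual diff. The point is that the hadoop code
path is skipped both when the client is missing and when `hadoop version` exits
non-zero.)

```
#include <cstdlib>

#include <sys/wait.h>

// Hypothetical helper: only report the hadoop client as usable when the
// command could be run and `hadoop version` exited with status 0. A shell
// exit code of 127 ("command not found") or any other non-zero code means
// we should not continue down the hadoop code path.
static bool hadoopClientUsable()
{
  int status = std::system("hadoop version > /dev/null 2>&1");

  if (status == -1 || !WIFEXITED(status)) {
    return false;  // Could not run the command at all.
  }

  return WEXITSTATUS(status) == 0;
}
```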


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).


Diffs (updated)
-

  src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
  src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
  src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 6, 2014, 4:12 a.m.)


Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.


Changes
---

Resolve merge conflicts


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).


Diffs (updated)
-

  src/launcher/fetcher.cpp bd95928bc3191970330e839bcf41e343d5142c54 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Review Request 27670: Bug fix: Check for non-zero status code and hadoop client not found

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27670/
---

Review request for mesos and Vinod Kone.


Repository: mesos-git


Description
---

Bug fix: Check for non-zero status code and hadoop client not found


Diffs
-

  src/launcher/fetcher.cpp bd95928bc3191970330e839bcf41e343d5142c54 

Diff: https://reviews.apache.org/r/27670/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27670: Bug fix: Check for non-zero status code and hadoop client not found

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27670/
---

(Updated Nov. 6, 2014, 6:25 a.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Repository: mesos-git


Description (updated)
---

@benh noticed that the fetcher would report a success even when the hadoop 
client exits with a non-zero value.


Diffs
-

  src/launcher/fetcher.cpp bd95928bc3191970330e839bcf41e343d5142c54 

Diff: https://reviews.apache.org/r/27670/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27670: Bug fix: Check for non-zero status code and hadoop client not found

2014-11-05 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27670/
---

(Updated Nov. 6, 2014, 7:07 a.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Repository: mesos-git


Description
---

@benh noticed that the fetcher would report a success even when the hadoop 
client exits with a non-zero value.


Diffs
-

  src/launcher/fetcher.cpp bd95928bc3191970330e839bcf41e343d5142c54 

Diff: https://reviews.apache.org/r/27670/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Why rely on url scheme for fetching?

2014-11-04 Thread Ankur Chauhan
Hi ben,

Thanks for the follow up. I think thats a perfectly fine game plan. I think I 
can break down things into smaller, more isolated chunks. But from the looks of 
MESOS-1316 the invocation code and the testing code seems to change a lot (in a 
good way), so to avoid a bunch of wasted cycles, I was going to hold off a 
little till it is committed and then go on to refactoring the transport parts. 
The json config change (MESOS-1248) is completely disconnected to this so I am 
not too worried about that one.
The only place where some coordination may be required is the 
launcher/fetcher.cpp. It seems there are common changes happening between my  
change https://reviews.apache.org/r/27483/ 
https://reviews.apache.org/r/27483/ and your 
https://reviews.apache.org/r/21316 https://reviews.apache.org/r/21316 which 
would mean a little rework but it's totally within reasonable limits.
I think if you get the MESOS-1316 committed and I get MESOS-1711 ( 
https://reviews.apache.org/r/27483 https://reviews.apache.org/r/27483 ) in, 
both of us would get unblocked in a way that we can get a way that lets us 
concurrently work on MESOS-336 and refactoring fetcher into more testable parts.

Does this make sense to you or did I misunderstand some part?

-- Ankur

 On 4 Nov 2014, at 02:05, Bernd Mathiske be...@mesosphere.io wrote:
 
 Typo: not - note
 
 On Nov 4, 2014, at 10:59 AM, Bernd Mathiske be...@mesosphere.io wrote:
 
 Hi Ankur,
 
 I like it, too. However, I cannot refrain from relaying (not my choice of 
 word here) to you the advice to break your relatively large patch down into 
 smaller parts. My patch for MESOS-336 certainly was, as I know now. My plan 
 is to get MESOS-1316 (which is now under review) and MESOS-1248 (which 
 should be quick) committed and then rebase MESOS-336 in smaller pieces. If 
 you have small pieces, too, I do think we can do this somewhat concurrently, 
 as your refactoring seems to affect mostly how the transport part works. I 
 am on the other hand mostly interested in the bookkeeping of what is in 
 progress, what succeeded, what failed, what has been put where, and so on, 
 which can be separated, if we are careful about it.
 
 (BTW, I think we need both unit tests and integration tests. And we don’t 
 have nearly enough of either.)
 
 Bernd
 
 On Nov 3, 2014, at 6:18 PM, Adam Bordelon a...@mesosphere.io 
 mailto:a...@mesosphere.io wrote:
 
 + Bernd, who has done some fetcher work, including additional testing, for 
 MESOS-1316, MESOS-1945, and MESOS-336
 
 On Mon, Nov 3, 2014 at 9:04 AM, Dominic Hamon dha...@twopensource.com 
 mailto:dha...@twopensource.com wrote:
 Hi Ankur
 
 I think this is a great approach. It makes the code much simpler, 
 extensible, and more testable. Anyone that's heard me rant knows I am a big 
 fan of unit tests over integration tests, so this shouldn't surprise anyone 
 :)
 
 If you haven't already, please read the documentation on contributing to 
 Mesos and the style guide to ensure all the naming is as expected, then you 
 can push the patch to reviewboard to get it reviewed and committed.
 
 On Mon, Nov 3, 2014 at 12:49 AM, Ankur Chauhan an...@malloc64.com 
 mailto:an...@malloc64.com wrote:
 Hi,
 
 I did some learning today! This is pretty much a very rough draft of the 
 tests/refactor of mesos-fetcher that I have come up with. Again, If there 
 are some obvious mistakes, please let me know. (this is my first pass after 
 all).
 https://github.com/ankurcha/mesos/compare/prefer_2
 
 My main intention is to break the logic of the fetcher into some very 
 discrete components that i can write tests against. I am still re-learning 
 cpp/mesos code styles etc so I may be a little slow to catch up but I would 
 really appreciate any comments and/or suggestions.
 
 -- Ankur
 @ankurcha
 
 On 2 Nov 2014, at 18:17, Ankur Chauhan an...@malloc64.com 
 mailto:an...@malloc64.com wrote:
 
 Hi,
 
 I noticed that the current set of tests in `src/tests/fetcher_tests.cpp` 
 is pretty coarse grained and are more on the lines of a functional test. I 
 was going to add some tests but it seems like if I am to do that I would 
 need to add a test dependency on hadoop. 
 
 As an alternative, I propose adding a good set of unit tests around the 
 methods used by `src/launcher/fetcher.cpp` and `src/hdfs/hdfs.cpp`. This 
 should be able to catch a good portion of cases at the same time keeping 
 the dependencies and runtime of tests low. What do you guys think about 
 this?
 
 PS: I am pretty green in terms of gtest and the overall c++ testing 
 methodology. Can someone give me pointers to good examples of tests in the 
 codebase.
 
 -- Ankur
 
 On 1 Nov 2014, at 22:54, Adam Bordelon a...@mesosphere.io 
 mailto:a...@mesosphere.io wrote:
 
 Thank you Ankur. At first glance, it looks great. We'll do a more 
 thorough review of it very soon.
 I know Tim St. Clair had some ideas for fixing MESOS-1711 
 https

Re: Why rely on url scheme for fetching?

2014-11-04 Thread Ankur Chauhan
Hi,

Sorry about that bernd! :s/ben/bernd/g

-- ankur
Sent from my iPhone

 On Nov 4, 2014, at 3:13 AM, Bernd Mathiske be...@mesosphere.io wrote:
 
 Hi Ankur, 
 
 I am Bernd, not Ben, but I’ll try to do my best :-)
 
 Your plan looks good to me and your patch for MESOS-1711 seems uncomplicated 
 enough to not cause me major problems no matter when it lands. 
 
 Bernd
 
 On Nov 4, 2014, at 11:44 AM, Ankur Chauhan an...@malloc64.com wrote:
 
 Hi ben,
 
 Thanks for the follow up. I think thats a perfectly fine game plan. I think 
 I can break down things into smaller, more isolated chunks. But from the 
 looks of MESOS-1316 the invocation code and the testing code seems to change 
 a lot (in a good way), so to avoid a bunch of wasted cycles, I was going to 
 hold off a little till it is committed and then go on to refactoring the 
 transport parts. The json config change (MESOS-1248) is completely 
 disconnected to this so I am not too worried about that one.
 The only place where some coordination may be required is the 
 launcher/fetcher.cpp. It seems there are common changes happening between my 
  change https://reviews.apache.org/r/27483/ and your 
  https://reviews.apache.org/r/21316 which would mean a little rework but 
 it's totally within reasonable limits.
 I think if you get the MESOS-1316 committed and I get MESOS-1711 ( 
  https://reviews.apache.org/r/27483 ) 
 in, both of us would get unblocked in a way that we can get a way that lets 
 us concurrently work on MESOS-336 and refactoring fetcher into more testable 
 parts.
 
 Does this make sense to you or did I misunderstand some part?
 
 -- Ankur
 
 On 4 Nov 2014, at 02:05, Bernd Mathiske be...@mesosphere.io 
 mailto:be...@mesosphere.io wrote:
 
 Typo: not - note
 
 On Nov 4, 2014, at 10:59 AM, Bernd Mathiske be...@mesosphere.io 
 mailto:be...@mesosphere.io wrote:
 
 Hi Ankur,
 
 I like it, too. However, I cannot refrain from relaying (not my choice of 
 word here) to you the advice to break your relatively large patch down 
 into smaller parts. My patch for MESOS-336 certainly was, as I know now. 
 My plan is to get MESOS-1316 (which is now under review) and MESOS-1248 
 (which should be quick) committed and then rebase MESOS-336 in smaller 
 pieces. If you have small pieces, too, I do think we can do this somewhat 
 concurrently, as your refactoring seems to affect mostly how the transport 
 part works. I am on the other hand mostly interested in the bookkeeping of 
 what is in progress, what succeeded, what failed, what has been put where, 
 and so on, which can be separated, if we are careful about it.
 
 (BTW, I think we need both unit tests and integration tests. And we don’t 
 have nearly enough of either.)
 
 Bernd
 
  On Nov 3, 2014, at 6:18 PM, Adam Bordelon a...@mesosphere.io wrote:
 
 + Bernd, who has done some fetcher work, including additional testing, 
 for MESOS-1316, MESOS-1945, and MESOS-336
 
  On Mon, Nov 3, 2014 at 9:04 AM, Dominic Hamon dha...@twopensource.com wrote:
 Hi Ankur
 
 I think this is a great approach. It makes the code much simpler, 
 extensible, and more testable. Anyone that's heard me rant knows I am a 
 big fan of unit tests over integration tests, so this shouldn't surprise 
 anyone :)
 
 If you haven't already, please read the documentation on contributing to 
 Mesos and the style guide to ensure all the naming is as expected, then 
 you can push the patch to reviewboard to get it reviewed and committed.
 
  On Mon, Nov 3, 2014 at 12:49 AM, Ankur Chauhan an...@malloc64.com wrote:
 Hi,
 
 I did some learning today! This is pretty much a very rough draft of the 
 tests/refactor of mesos-fetcher that I have come up with. Again, If there 
 are some obvious mistakes, please let me know. (this is my first pass 
 after all).
  https://github.com/ankurcha/mesos/compare/prefer_2
 
  My main intention is to break the logic of the fetcher into some very 
 discrete components that i can write tests against. I am still 
 re-learning cpp/mesos code styles etc so I may be a little slow to catch 
 up but I would really appreciate any comments and/or suggestions.
 
 -- Ankur
 @ankurcha
 
  On 2 Nov 2014, at 18:17, Ankur Chauhan an...@malloc64.com

Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan


 On Nov. 3, 2014, 9:59 p.m., Timothy St. Clair wrote:
  src/launcher/fetcher.cpp, line 77
  https://reviews.apache.org/r/27483/diff/1/?file=746870#file746870line77
 
  we should probably bail here, if somehow the return is != 0  (isError() 
  || false)
 
 Ankur Chauhan wrote:
 My thinking here was: In case of a user that does not have hadoop_home 
 set and no hadoop in path, we would like to continue with other methods. If 
 we bail here for whatever reason (no hadoop, misconfigured hadoop etc), we 
 would end up breaking the fetcher for all those people. - This holds only if 
 we assume that hadoop/hdfs is **not** a hard dependency.
 
 Timothy St. Clair wrote:
 We should probably add a comment then to outline the implied 
 assumption(s).

I have added a comment detailing this.


- Ankur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/#review59644
---


On Nov. 4, 2014, 1:04 a.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27483/
 ---
 
 (Updated Nov. 4, 2014, 1:04 a.m.)
 
 
 Review request for mesos and Timothy St. Clair.
 
 
 Bugs: MESOS-1711
 https://issues.apache.org/jira/browse/MESOS-1711
 
 
 Repository: mesos-git
 
 
 Description
 ---
 
 Previously, the fetcher used a hardcoded list of schemes to determine what 
 URIs could be fetched by hadoop (if available). This is now changed such that 
 we first check if hadoop can fetch them for us and then we fallback to the 
 os::net and then a local copy method (same as it used to be). This allows 
 users to fetch artifacts from arbitrary filesystems as long as hadoop is 
 correctly configured (in core-site.xml).
 
 
 Diffs
 -
 
   src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
   src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
 
 Diff: https://reviews.apache.org/r/27483/diff/
 
 
 Testing
 ---
 
 make check
 sudo bin/mesos-tests.sh --verbose
 support/mesos-style.py
 
 
 Thanks,
 
 Ankur Chauhan
 




Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 4, 2014, 6:25 p.m.)


Review request for mesos and Timothy St. Clair.


Changes
---

Add comments about hadoop client dependency.


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).


Diffs (updated)
-

  src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
  src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 4, 2014, 7:47 p.m.)


Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.


Changes
---

updating reviewers to reflect.


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).


Diffs
-

  src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
  src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan


 On Nov. 4, 2014, 9:46 p.m., Vinod Kone wrote:
  src/launcher/fetcher.cpp, line 223
  https://reviews.apache.org/r/27483/diff/7/?file=748389#file748389line223
 
  Why fall through if result.isError()?

The way I am thinking of the fetcher logic is along these lines: we try the 
fetchers one by one until one succeeds, *even* if an unsuccessful fetcher could 
in principle have handled the URI. This means that if a fetcher fails for any 
reason (maybe the user mesos-fetcher is running as doesn't have permissions, or 
whatever the case may be), we always move on to the next one. I understand that 
a case can be made for something along the lines of 

```
if (local.canHandle(uri)) {
  return local.fetch(uri);
} else if (os_net.canHandle(uri)) {
  return os_net.fetch(uri);
} else if (hadoop.canHandle(uri)) {
  return hadoop.fetch(uri);
} else {
  return Error("Can't download uri: " + uri);
}
```

But this way we are assuming that local fetcher has all possible permissions to 
handle local/file uris *and* os_net has all the configuration to handle all 
http(s) | ftp/ftp(s) uris and takes away the fallback. It may be the case that 
the hadoop fetcher (and hence hadoop client) has some credentials in its config 
files that aren't present in the urls themselves. An example of this is if ftp 
uris are provided without user or password but core-site.xml has the required 
user/password in it.

Does this answer your concern?


- Ankur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/#review59842
---


On Nov. 4, 2014, 7:47 p.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27483/
 ---
 
 (Updated Nov. 4, 2014, 7:47 p.m.)
 
 
 Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.
 
 
 Bugs: MESOS-1711
 https://issues.apache.org/jira/browse/MESOS-1711
 
 
 Repository: mesos-git
 
 
 Description
 ---
 
 Previously, the fetcher used a hardcoded list of schemes to determine what 
 URIs could be fetched by hadoop (if available). This is now changed such that 
 we first check if hadoop can fetch them for us and then we fallback to the 
 os::net and then a local copy method (same as it used to be). This allows 
 users to fetch artifacts from arbitrary filesystems as long as hadoop is 
 correctly configured (in core-site.xml).
 
 
 Diffs
 -
 
   src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
   src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
 
 Diff: https://reviews.apache.org/r/27483/diff/
 
 
 Testing
 ---
 
 make check
 sudo bin/mesos-tests.sh --verbose
 support/mesos-style.py
 
 
 Thanks,
 
 Ankur Chauhan
 




Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 4, 2014, 10:20 p.m.)


Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.


Changes
---

Reformat comments to be 70 cols width


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).


Diffs (updated)
-

  src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
  src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan


 On Nov. 4, 2014, 9:46 p.m., Vinod Kone wrote:
  src/launcher/fetcher.cpp, line 223
  https://reviews.apache.org/r/27483/diff/7/?file=748389#file748389line223
 
  Why fall through if result.isError()?
 
 Ankur Chauhan wrote:
 The way I am thinking of the logic of fetchers is on the lines of we try 
 all fetchers one by one till we succeed *even* if an unsuccessful fetcher can 
 handle it. This means that if for any reason (maybe mesos-fetcher is running 
 doesn't have permissions or whatever be the case) a fetcher fails we always 
 go to the next one. I understand that a case can be made to do something on 
 the lines of 
 
 ```
  if (local.canHandle(uri)) {
    return local.fetch(uri);
  } else if (os_net.canHandle(uri)) {
    return os_net.fetch(uri);
  } else if (hadoop.canHandle(uri)) {
    return hadoop.fetch(uri);
  } else {
    return Error("Can't download uri: " + uri);
 }
 ```
 
 But this way we are assuming that local fetcher has all possible 
 permissions to handle local/file uris *and* os_net has all the configuration 
 to handle all http(s) | ftp/ftp(s) uris and takes away the fallback. It may 
 be the case that the hadoop fetcher (and hence hadoop client) has some 
 credentials in its config files that aren't present in the urls themselves. 
 An example of this is if ftp uris are provided without user or password but 
 core-site.xml has the required user/password in it.
 
 Does this answer your concern?
 
 Vinod Kone wrote:
  I'm concerned about how the logs will look for a user when fetching 
 URIs. If it's a non-local URI they would always see a message and failure 
 with local copy before using HDFS or Net, which is bad.
 
 I would much rather make it explicit what you are doing, with comments.
 
 ```
  if (local.canHandle(uri)) {
    result = local.fetch(uri);
    if (result.isSome()) {
      return result;
    } else {
      LOG(WARNING) << "Failed to copy local file: " << result.Error();
    }
  } else if (os_net.canHandle(uri)) {
    result = os_net.fetch(uri);
    if (result.isSome()) {
      return result;
    } else {
      LOG(WARNING) << "Failed to download URI: " << result.Error();
    }
  }

  // If we are here, one of the following is possible.
  // 1. Failed to copy a local URI.
  // 2. Failed to download a remote URI using os::net.
  // 3. HDFS compatible URI.
  // 4. Unexpected URI.
  // For all these cases we try to fetch the URI using the Hadoop client
  // as a fallback.
 
 return hadoop.fetch(uri);
 
 ```
 
 More importantly, are there really file permission issues where cp 
 doesn't work but hadoop fs copyToLocal works? Can you give me a concrete 
 example? I ask because if the hadoop client hangs for whatever reason (we 
  have seen this in production at Twitter), it will take up to the executor 
 registration timeout for the slave to kill the executor and transition the 
 task to LOST. Getting rid of hadoop dependency was one of the reasons we 
 install our executors locally. This change adds the dependency back on 
 hadoop, even for local files, which is unfortunate.

That does make sense, I hadn't considered the logging :-( and I am all up for 
no hard external dependencies if I can avoid it. Let me think a little on that.


- Ankur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/#review59842
---


On Nov. 4, 2014, 10:20 p.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27483/
 ---
 
 (Updated Nov. 4, 2014, 10:20 p.m.)
 
 
 Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.
 
 
 Bugs: MESOS-1711
 https://issues.apache.org/jira/browse/MESOS-1711
 
 
 Repository: mesos-git
 
 
 Description
 ---
 
 Previously, the fetcher used a hardcoded list of schemes to determine what 
 URIs could be fetched by hadoop (if available). This is now changed such that 
 we first check if hadoop can fetch them for us and then we fallback to the 
 os::net and then a local copy method (same as it used to be). This allows 
 users to fetch artifacts from arbitrary filesystems as long as hadoop is 
 correctly configured (in core-site.xml).
 
 
 Diffs
 -
 
   src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
   src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
 
 Diff: https://reviews.apache.org/r/27483/diff/
 
 
 Testing
 ---
 
 make check
 sudo bin/mesos-tests.sh --verbose
 support/mesos-style.py
 
 
 Thanks,
 
 Ankur Chauhan
 




Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/
---

(Updated Nov. 5, 2014, 6:39 a.m.)


Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.


Changes
---

Do compatibility check before using a fetcher, add tests.


Bugs: MESOS-1711
https://issues.apache.org/jira/browse/MESOS-1711


Repository: mesos-git


Description
---

Previously, the fetcher used a hardcoded list of schemes to determine what URIs 
could be fetched by hadoop (if available). This is now changed such that we 
first check if hadoop can fetch them for us and then we fallback to the os::net 
and then a local copy method (same as it used to be). This allows users to 
fetch artifacts from arbitrary filesystems as long as hadoop is correctly 
configured (in core-site.xml).


Diffs (updated)
-

  src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
  src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
  src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 

Diff: https://reviews.apache.org/r/27483/diff/


Testing
---

make check
sudo bin/mesos-tests.sh --verbose
support/mesos-style.py


Thanks,

Ankur Chauhan



Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan


 On Nov. 4, 2014, 9:46 p.m., Vinod Kone wrote:
  src/launcher/fetcher.cpp, line 223
  https://reviews.apache.org/r/27483/diff/7/?file=748389#file748389line223
 
  Why fall through if result.isError()?
 
 Ankur Chauhan wrote:
 The way I am thinking of the logic of fetchers is on the lines of we try 
 all fetchers one by one till we succeed *even* if an unsuccessful fetcher can 
 handle it. This means that if for any reason (maybe mesos-fetcher is running 
 doesn't have permissions or whatever be the case) a fetcher fails we always 
 go to the next one. I understand that a case can be made to do something on 
 the lines of 
 
 ```
  if (local.canHandle(uri)) {
    return local.fetch(uri);
  } else if (os_net.canHandle(uri)) {
    return os_net.fetch(uri);
  } else if (hadoop.canHandle(uri)) {
    return hadoop.fetch(uri);
  } else {
    return Error("Can't download uri: " + uri);
 }
 ```
 
 But this way we are assuming that local fetcher has all possible 
 permissions to handle local/file uris *and* os_net has all the configuration 
 to handle all http(s) | ftp/ftp(s) uris and takes away the fallback. It may 
 be the case that the hadoop fetcher (and hence hadoop client) has some 
 credentials in its config files that aren't present in the urls themselves. 
 An example of this is if ftp uris are provided without user or password but 
 core-site.xml has the required user/password in it.
 
 Does this answer your concern?
 
 Vinod Kone wrote:
  I'm concerned about how the logs will look for a user when fetching 
 URIs. If it's a non-local URI they would always see a message and failure 
 with local copy before using HDFS or Net, which is bad.
 
 I would much rather make it explicit what you are doing, with comments.
 
 ```
  if (local.canHandle(uri)) {
    result = local.fetch(uri);
    if (result.isSome()) {
      return result;
    } else {
      LOG(WARNING) << "Failed to copy local file: " << result.Error();
    }
  } else if (os_net.canHandle(uri)) {
    result = os_net.fetch(uri);
    if (result.isSome()) {
      return result;
    } else {
      LOG(WARNING) << "Failed to download URI: " << result.Error();
    }
  }

  // If we are here, one of the following is possible.
  // 1. Failed to copy a local URI.
  // 2. Failed to download a remote URI using os::net.
  // 3. HDFS compatible URI.
  // 4. Unexpected URI.
  // For all these cases we try to fetch the URI using the Hadoop client
  // as a fallback.
 
 return hadoop.fetch(uri);
 
 ```
 
 More importantly, are there really file permission issues where cp 
 doesn't work but hadoop fs copyToLocal works? Can you give me a concrete 
 example? I ask because if the hadoop client hangs for whatever reason (we 
  have seen this in production at Twitter), it will take up to the executor 
 registration timeout for the slave to kill the executor and transition the 
 task to LOST. Getting rid of hadoop dependency was one of the reasons we 
 install our executors locally. This change adds the dependency back on 
 hadoop, even for local files, which is unfortunate.
 
 Ankur Chauhan wrote:
 That does make sense, I hadn't considered the logging :-( and I am all up 
 for no hard external dependencies if I can avoid it. Let me think a little on 
 that.

The latest patch should take care of this. I am following the same approach as 
above.


- Ankur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/#review59842
---


On Nov. 5, 2014, 6:39 a.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27483/
 ---
 
 (Updated Nov. 5, 2014, 6:39 a.m.)
 
 
 Review request for mesos, Timothy Chen, Timothy St. Clair, and Vinod Kone.
 
 
 Bugs: MESOS-1711
 https://issues.apache.org/jira/browse/MESOS-1711
 
 
 Repository: mesos-git
 
 
 Description
 ---
 
 Previously, the fetcher used a hardcoded list of schemes to determine what 
 URIs could be fetched by hadoop (if available). This is now changed such that 
 we first check if hadoop can fetch them for us and then we fallback to the 
 os::net and then a local copy method (same as it used to be). This allows 
 users to fetch artifacts from arbitrary filesystems as long as hadoop is 
 correctly configured (in core-site.xml).
 
 
 Diffs
 -
 
   src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 
   src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 
   src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa 
 
 Diff: https://reviews.apache.org/r/27483/diff/
 
 
 Testing
 ---
 
 make check

Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.

2014-11-04 Thread Ankur Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27483/#review59934
---


This may be a side issue but I noticed in a run of `bin/mesos-tests.sh 
--gtest_filter=FetcherTest.*`:

```
$ ./bin/mesos-tests.sh --gtest_filter=FetcherTest.*
Source directory: /Users/achauhan/Projects/mesos
Build directory: /Users/achauhan/Projects/mesos/build
-
We cannot run any Docker tests because:
Docker tests not supported on non-Linux systems
-
Note: Google Test filter = 
FetcherTest.*-DockerContainerizerTest.ROOT_DOCKER_Launch:DockerContainerizerTest.ROOT_DOCKER_Kill:DockerContainerizerTest.ROOT_DOCKER_Usage:DockerContainerizerTest.DISABLED_ROOT_DOCKER_Recover:DockerContainerizerTest.ROOT_DOCKER_Logs:DockerContainerizerTest.ROOT_DOCKER_Default_CMD:DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Override:DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Args:DockerContainerizerTest.ROOT_DOCKER_SlaveRecoveryTaskContainer:DockerContainerizerTest.DISABLED_ROOT_DOCKER_SlaveRecoveryExecutorContainer:DockerContainerizerTest.ROOT_DOCKER_PortMapping:DockerContainerizerTest.ROOT_DOCKER_LaunchSandboxWithColon:DockerTest.ROOT_DOCKER_interface:DockerTest.ROOT_DOCKER_CheckCommandWithShell:DockerTest.ROOT_DOCKER_CheckPortResource:DockerTest.ROOT_DOCKER_CancelPull:SlaveTest.ROOT_RunTaskWithCommandInfoWithoutUser:SlaveTest.DISABLED_ROOT_RunTaskWithCommandInfoWithUser:SlaveCount/Registrar_BENCHMARK_Test.performance/0:SlaveCount/Registrar_BENCHMARK_Test.performance/1:SlaveCount/Registrar_BENCHMARK_Test.performance/2:SlaveCount/Registrar_BENCHMARK_Test.performance/3
[==] Running 4 tests from 1 test case.
[--] Global test environment set-up.
[--] 4 tests from FetcherTest
[ RUN  ] FetcherTest.FileURI
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1104 22:43:38.412901 2068996864 fetcher.cpp:197] Fetching URI 
'file:///private/tmp/FetcherTest_FileURI_ipAYlv/from/test'
I1104 22:43:38.413653 2068996864 fetcher.cpp:178] Copying resource from 
'/private/tmp/FetcherTest_FileURI_ipAYlv/from/test' to 
'/private/tmp/FetcherTest_FileURI_ipAYlv'
I1104 22:43:38.419317 2068996864 fetcher.cpp:300] Skipped extracting path 
'/private/tmp/FetcherTest_FileURI_ipAYlv/test'
[   OK ] FetcherTest.FileURI (110 ms)
[ RUN  ] FetcherTest.FilePath
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1104 22:43:38.507416 2068996864 fetcher.cpp:197] Fetching URI 
'/private/tmp/FetcherTest_FilePath_lAzOZG/from/test'
I1104 22:43:38.508020 2068996864 fetcher.cpp:178] Copying resource from 
'/private/tmp/FetcherTest_FilePath_lAzOZG/from/test' to 
'/private/tmp/FetcherTest_FilePath_lAzOZG'
I1104 22:43:38.512864 2068996864 fetcher.cpp:300] Skipped extracting path 
'/private/tmp/FetcherTest_FilePath_lAzOZG/test'
[   OK ] FetcherTest.FilePath (100 ms)
[ RUN  ] FetcherTest.OSNetUriTest
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1104 22:43:38.608177 2068996864 fetcher.cpp:197] Fetching URI 
'http://www.example.com/index.html'
I1104 22:43:38.608901 2068996864 fetcher.cpp:109] Fetching URI 
'http://www.example.com/index.html' with os::net
I1104 22:43:38.608924 2068996864 fetcher.cpp:119] Downloading 
'http://www.example.com/index.html' to 
'/private/tmp/FetcherTest_OSNetUriTest_ZB4nAg/index.html'
I1104 22:43:38.619645 2068996864 fetcher.cpp:300] Skipped extracting path 
'/private/tmp/FetcherTest_OSNetUriTest_ZB4nAg/index.html'
[   OK ] FetcherTest.OSNetUriTest (105 ms)
[ RUN  ] FetcherTest.FileLocalhostURI
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1104 22:43:38.712461 2068996864 fetcher.cpp:197] Fetching URI 
'file://localhost/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/from/test'
I1104 22:43:38.713353 2068996864 fetcher.cpp:178] Copying resource from 
'/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/from/test' to 
'/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH'
I1104 22:43:38.718541 2068996864 fetcher.cpp:300] Skipped extracting path 
'/private/tmp/FetcherTest_FileLocalhostURI_m1wYnH/test'
[   OK ] FetcherTest.FileLocalhostURI (103 ms)
[--] 4 tests from FetcherTest (419 ms total)

[--] Global test environment tear-down
[==] 4 tests from 1 test case ran. (434 ms total)
[  PASSED  ] 4 tests.

  YOU HAVE 5 DISABLED TESTS
```

There are warnings in the output:

```
WARNING: Logging before InitGoogleLogging() is written to STDERR
```

And looking at the code, I don't see a glog initialization codeblock. Should 
there be `google::InitGoogleLogging(argv[0]);` in the `main` method?

- Ankur Chauhan


On Nov. 5, 2014, 6:39 a.m., Ankur Chauhan wrote:
 
 ---
 This is an automatically generated e-mail

Re: Why rely on url scheme for fetching?

2014-11-03 Thread Ankur Chauhan
Hi,

I did some learning today! This is pretty much a very rough draft of the 
tests/refactor of mesos-fetcher that I have come up with. Again, If there are 
some obvious mistakes, please let me know. (this is my first pass after all).
https://github.com/ankurcha/mesos/compare/prefer_2

My main intention is to break the logic of the fetcher into some very discrete 
components that i can write tests against. I am still re-learning cpp/mesos 
code styles etc so I may be a little slow to catch up but I would really 
appreciate any comments and/or suggestions.

-- Ankur
@ankurcha
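
(A minimal gtest sketch of the kind of focused unit test proposed above;
FetcherHelperTest and isNetUri are hypothetical names, not code in the
prefer_2 branch.)

```
#include <string>

#include <gtest/gtest.h>

// Hypothetical helper of the sort the refactor would expose for testing.
static bool isNetUri(const std::string& uri)
{
  return uri.compare(0, 7, "http://") == 0 ||
         uri.compare(0, 8, "https://") == 0 ||
         uri.compare(0, 6, "ftp://") == 0 ||
         uri.compare(0, 7, "ftps://") == 0;
}

// Runs with the stock gtest main (link against gtest_main); no hadoop needed.
TEST(FetcherHelperTest, RecognizesNetUris)
{
  EXPECT_TRUE(isNetUri("http://example.com/app.tar.gz"));
  EXPECT_TRUE(isNetUri("ftps://example.com/app.tar.gz"));
  EXPECT_FALSE(isNetUri("hdfs://namenode/app.tar.gz"));
  EXPECT_FALSE(isNetUri("/tmp/app.tar.gz"));
}
```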

 On 2 Nov 2014, at 18:17, Ankur Chauhan an...@malloc64.com wrote:
 
 Hi,
 
 I noticed that the current set of tests in `src/tests/fetcher_tests.cpp` is 
 pretty coarse grained and are more on the lines of a functional test. I was 
 going to add some tests but it seems like if I am to do that I would need to 
 add a test dependency on hadoop. 
 
 As an alternative, I propose adding a good set of unit tests around the 
 methods used by `src/launcher/fetcher.cpp` and `src/hdfs/hdfs.cpp`. This 
 should be able to catch a good portion of cases at the same time keeping the 
  dependencies and runtime of tests low. What do you guys think about this?
 
 PS: I am pretty green in terms of gtest and the overall c++ testing 
 methodology. Can someone give me pointers to good examples of tests in the 
 codebase.
 
 -- Ankur
 
 On 1 Nov 2014, at 22:54, Adam Bordelon a...@mesosphere.io 
 mailto:a...@mesosphere.io wrote:
 
 Thank you Ankur. At first glance, it looks great. We'll do a more thorough 
 review of it very soon.
 I know Tim St. Clair had some ideas for fixing MESOS-1711 
 https://issues.apache.org/jira/browse/MESOS-1711; he may want to review 
 too.
 
 On Sat, Nov 1, 2014 at 8:49 PM, Ankur Chauhan an...@malloc64.com 
 mailto:an...@malloc64.com wrote:
 Hi Tim,
 
  I just created a review https://reviews.apache.org/r/27483/ It's my first 
  stab at it and I will 
 try to add more tests as I figure out how to do the hadoop mocking and 
 stuff. Have a look and let me know what you think about it so far.
 
 -- Ankur
 
 On 1 Nov 2014, at 20:05, Ankur Chauhan an...@malloc64.com 
 mailto:an...@malloc64.com wrote:
 
 Yea, i saw that the minute i pressed send. I'll start the review board so 
 that people can have a look at the change.
 
 -- Ankur
 
 On 1 Nov 2014, at 20:01, Tim Chen t...@mesosphere.io 
 mailto:t...@mesosphere.io wrote:
 
 Hi Ankur,
 
 There is a fetcher_tests.cpp in src/tests.
 
 Tim
 
 On Sat, Nov 1, 2014 at 7:27 PM, Ankur Chauhan an...@malloc64.com 
 mailto:an...@malloc64.com wrote:
 Hi Tim,
 
 I am trying to find/write some test cases. I couldn't find a 
 fetcher_tests.{cpp|hpp} so once I have something, I'll post on review 
 board. I am new to gmock/gtest so bear with me while i get up to speed.
 
 -- Ankur
 
 On 1 Nov 2014, at 19:23, Timothy Chen t...@mesosphere.io 
 mailto:t...@mesosphere.io wrote:
 
 Hi Ankur,
 
 Can you post on reviewboard? We can discuss more about the code there.
 
 Tim
 
 Sent from my iPhone
 
 On Nov 1, 2014, at 6:29 PM, Ankur Chauhan an...@malloc64.com 
 mailto:an...@malloc64.com wrote:
 
 Hi Tim,
 
 I don't think there is an issue which is directly in line with what i 
 wanted but the closest one that I could find in JIRA is 
  https://issues.apache.org/jira/browse/MESOS-1711
 
 I have a branch ( 
  https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher ) that 
 has a change that would enable users to specify whatever hdfs compatible 
  uris to the mesos-fetcher but maybe you can weigh in on it. Do you 
 think this is the right track? if so, i would like to pick this issue 
 and submit a patch for review.
 
 -- Ankur
 
 
 On 1 Nov 2014, at 04:32, Tom Arnfeld t...@duedil.com 
 mailto:t...@duedil.com wrote:
 
 Completely +1 to this. There are now quite a lot of hadoop compatible 
 filesystem wrappers out in the wild and this would certainly be very 
 useful.
 
 I'm happy to contribute a patch. Here's a few related issues that might 
 be of interest;
 
  - https://issues.apache.org/jira/browse/MESOS-1887
  - https://issues.apache.org/jira/browse/MESOS-1316
  - https://issues.apache.org/jira/browse/MESOS-336
  - https://issues.apache.org/jira/browse/MESOS-1248
 
 On 31 October 2014 22:39, Tim Chen t...@mesosphere.io 
 mailto:t...@mesosphere.io wrote:
 I believe there is already a JIRA ticket for this, if you search for 
 fetcher in Mesos JIRA I think you can find it.
 
 Tim
 
 On Fri, Oct 31, 2014 at 3:27 PM, Ankur Chauhan an...@malloc64.com 
 mailto:an...@malloc64.com wrote:
 Hi,
 
 I have been looking

Re: Why rely on url scheme for fetching?

2014-11-03 Thread Ankur Chauhan
Yea, I saw those this morning. I'll hold off a little; MESOS-336 changes a lot 
of stuff. 

Sent from my iPhone

 On Nov 3, 2014, at 9:18 AM, Adam Bordelon a...@mesosphere.io wrote:
 
 + Bernd, who has done some fetcher work, including additional testing, for 
 MESOS-1316, MESOS-1945, and MESOS-336
 
 On Mon, Nov 3, 2014 at 9:04 AM, Dominic Hamon dha...@twopensource.com 
 wrote:
 Hi Ankur
 
 I think this is a great approach. It makes the code much simpler, 
 extensible, and more testable. Anyone that's heard me rant knows I am a big 
 fan of unit tests over integration tests, so this shouldn't surprise anyone 
 :)
 
 If you haven't already, please read the documentation on contributing to 
 Mesos and the style guide to ensure all the naming is as expected, then you 
 can push the patch to reviewboard to get it reviewed and committed.
 
 On Mon, Nov 3, 2014 at 12:49 AM, Ankur Chauhan an...@malloc64.com wrote:
 Hi,
 
 I did some learning today! This is pretty much a very rough draft of the 
 tests/refactor of mesos-fetcher that I have come up with. Again, If there 
 are some obvious mistakes, please let me know. (this is my first pass after 
 all).
 https://github.com/ankurcha/mesos/compare/prefer_2
 
  My main intention is to break the logic of the fetcher into some very 
 discrete components that i can write tests against. I am still re-learning 
 cpp/mesos code styles etc so I may be a little slow to catch up but I would 
 really appreciate any comments and/or suggestions.
 
 -- Ankur
 @ankurcha
 
 On 2 Nov 2014, at 18:17, Ankur Chauhan an...@malloc64.com wrote:
 
 Hi,
 
 I noticed that the current set of tests in `src/tests/fetcher_tests.cpp` 
 is pretty coarse grained and are more on the lines of a functional test. I 
 was going to add some tests but it seems like if I am to do that I would 
 need to add a test dependency on hadoop. 
 
 As an alternative, I propose adding a good set of unit tests around the 
 methods used by `src/launcher/fetcher.cpp` and `src/hdfs/hdfs.cpp`. This 
 should be able to catch a good portion of cases at the same time keeping 
  the dependencies and runtime of tests low. What do you guys think about 
 this?
 
 PS: I am pretty green in terms of gtest and the overall c++ testing 
 methodology. Can someone give me pointers to good examples of tests in the 
 codebase.
 
 -- Ankur
 
 On 1 Nov 2014, at 22:54, Adam Bordelon a...@mesosphere.io wrote:
 
 Thank you Ankur. At first glance, it looks great. We'll do a more 
 thorough review of it very soon.
 I know Tim St. Clair had some ideas for fixing MESOS-1711; he may want to 
 review too.
 
 On Sat, Nov 1, 2014 at 8:49 PM, Ankur Chauhan an...@malloc64.com wrote:
 Hi Tim,
 
 I just created a review https://reviews.apache.org/r/27483/ It's my 
 first stab at it and I will try to add more tests as I figure out how to 
 do the hadoop mocking and stuff. Have a look and let me know what you 
 think about it so far.
 
 -- Ankur
 
 On 1 Nov 2014, at 20:05, Ankur Chauhan an...@malloc64.com wrote:
 
 Yea, i saw that the minute i pressed send. I'll start the review board 
 so that people can have a look at the change.
 
 -- Ankur
 
 On 1 Nov 2014, at 20:01, Tim Chen t...@mesosphere.io wrote:
 
 Hi Ankur,
 
 There is a fetcher_tests.cpp in src/tests.
 
 Tim
 
 On Sat, Nov 1, 2014 at 7:27 PM, Ankur Chauhan an...@malloc64.com 
 wrote:
 Hi Tim,
 
 I am trying to find/write some test cases. I couldn't find a 
 fetcher_tests.{cpp|hpp}, so once I have something, I'll post it on 
 ReviewBoard. I am new to gmock/gtest, so bear with me while I get up to 
 speed.
 
 -- Ankur
 
 On 1 Nov 2014, at 19:23, Timothy Chen t...@mesosphere.io wrote:
 
 Hi Ankur,
 
 Can you post on reviewboard? We can discuss more about the code 
 there.
 
 Tim
 
 Sent from my iPhone
 
 On Nov 1, 2014, at 6:29 PM, Ankur Chauhan an...@malloc64.com 
 wrote:
 
 Hi Tim,
 
 I don't think there is an issue that lines up exactly with what I wanted, 
 but the closest one I could find in JIRA is 
 https://issues.apache.org/jira/browse/MESOS-1711
 
 I have a branch ( 
 https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher ) with a 
 change that would let users pass any HDFS-compatible URI to the 
 mesos-fetcher, but maybe you can weigh in on it. Do you think this is the 
 right track? If so, I would like to pick up this issue and submit a patch 
 for review.
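 
 For context, the framework-facing usage I am after would look something like 
 the sketch below. Everything in it is illustrative: it assumes the change 
 lands, assumes the slave's hadoop client can already resolve the snackfs:// 
 scheme (i.e. the SnackFS jar and its configuration are on the Hadoop 
 classpath on every slave), and the command and URI values are made up.
 
   // Generated protobuf header; the exact include path can differ
   // depending on how Mesos is installed.
   #include <mesos/mesos.pb.h>
 
   // The framework just lists the artifact as a URI; under this change the
   // slave-side mesos-fetcher is what hands any scheme it does not handle
   // natively (http, https, ftp, local paths) to the Hadoop client.
   mesos::CommandInfo makeCommand()
   {
     mesos::CommandInfo command;
     command.set_value("./run-my-task.sh");
 
     mesos::CommandInfo::URI* uri = command.add_uris();
     uri->set_value("snackfs://cassandra/artifacts/my-task.tar.gz");
     uri->set_executable(false);  // It is an archive, not a binary.
 
     return command;
   }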
 
 -- Ankur
 
 
 On 1 Nov 2014, at 04:32, Tom Arnfeld t...@duedil.com wrote:
 
 Completely +1 to this. There are now quite a lot of Hadoop-compatible 
 filesystem wrappers out in the wild, and this would certainly be very useful.
 
 I'm happy to contribute a patch. Here are a few related issues that might be 
 of interest:
 
 - https://issues.apache.org/jira/browse/MESOS-1887
 - https://issues.apache.org/jira/browse/MESOS-1316
 - https://issues.apache.org/jira/browse/MESOS-336
 - https://issues.apache.org/jira/browse/MESOS-1248
 
 On 31 October 2014 22:39, Tim Chen t...@mesosphere.io wrote:
 I believe

Re: Why rely on url scheme for fetching?

2014-11-03 Thread Ankur Chauhan
Hi,

Okay, thanks Ian. What's the expected ETA on getting 0.21.0 out? JIRA didn't 
have a release date set.

-- Ankur

 On 3 Nov 2014, at 10:37, Ian Downes idow...@twitter.com.INVALID wrote:
 
 Unfortunately, this will not get in 0.21.0 as we're tagging that today.
 
 Please tag the ticket(s) as Target Version = 0.22.0.
 
 Ian
 
 On Mon, Nov 3, 2014 at 10:22 AM, Ankur Chauhan an...@malloc64.com wrote:
 
 Hi Tim/others,
 
 Is this to be included in the 0.21.0 release? If so, I don't know how to 
 tag it. I would (shamelessly) love for it to be included, as it would really 
 simplify my intended use case of using SnackFS (a Cassandra-backed 
 filesystem).
 
 -- Ankur
 
 On 3 Nov 2014, at 09:28, Ankur Chauhan an...@malloc64.com wrote:
 
 Yeah, I saw those this morning. I'll hold off a little; MESOS-336 changes a 
 lot of stuff.
 
 Sent from my iPhone
 