Re: Spark on Mesos 0.20

2014-10-06 Thread Timothy Chen
Hi Gurvinder,

I tried fine-grained mode before and didn't run into that problem.


On Sun, Oct 5, 2014 at 11:44 PM, Gurvinder Singh wrote:
> On 10/06/2014 08:19 AM, Fairiz Azizi wrote:
>> The Spark online docs indicate that Spark is compatible with Mesos 0.18.1
>>
>> I've gotten it to work just fine on 0.18.1 and 0.18.2
>>
>> Has anyone tried Spark on a newer version of Mesos, i.e. Mesos v0.20.0?
>>
>> -Fi
>>
> Yeah, we are using Spark 1.1.0 with Mesos 0.20.1. It runs fine in
> coarse-grained mode; in fine-grained mode there is an issue with
> conflicting block manager names. I have been waiting for it to be fixed,
> but it is still there.
>
> -Gurvinder




Re: Spark on Mesos 0.20

2014-10-06 Thread Timothy Chen
(Hit enter too soon...)

What is your setup and steps to repro this?

Tim




Re: Spark on Mesos 0.20

2014-10-06 Thread Gurvinder Singh
The issue does not occur if the job at hand has a small number of map
tasks. I have a job with 978 map tasks, and I see this error:

14/10/06 09:34:40 ERROR BlockManagerMasterActor: Got two different block
manager registrations on 20140711-081617-711206558-5050-2543-5

Here is the log from the mesos-slave where this container was running.

http://pastebin.com/Q1Cuzm6Q

If you look at the Spark code that produces this error, you will see that
it simply exits, with a comment saying "this should never happen, let's
just quit" :-)
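
As a workaround we run in coarse-grained mode for now. A minimal sketch of
the switch, assuming the Spark 1.1 property name (the master URL and the
job itself are unchanged):

import org.apache.spark.{SparkConf, SparkContext}

// Coarse-grained mode keeps one long-lived executor per node instead of
// launching executors per task, which sidesteps the duplicate block
// manager registration for us.
val conf = new SparkConf()
  .setAppName("mesos-coarse-workaround")
  .set("spark.mesos.coarse", "true")
val sc = new SparkContext(conf)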

- Gurvinder



TorrentBroadcast slow performance

2014-10-06 Thread Guillaume Pitel

Hi,

I've had no answer to this on u...@spark.apache.org, so I'm posting it on dev
before filing a JIRA (in case the problem or solution has already been
identified).


We've had some performance issues since switching to 1.1.0, and we finally
found the origin: TorrentBroadcast seems to be very slow in our setting
(and it became the default in 1.1.0).


The logs for a 4 MB variable with TorrentBroadcast (15 s):

14/10/01 15:47:13 INFO storage.MemoryStore: Block broadcast_84_piece1 stored as 
bytes in memory (estimated size 171.6 KB, free 7.2 GB)
14/10/01 15:47:13 INFO storage.BlockManagerMaster: Updated info of block 
broadcast_84_piece1
14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4194304) called with 
curMem=1401611984, maxMem=9168696115
14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84_piece0 stored as 
bytes in memory (estimated size 4.0 MB, free 7.2 GB)
14/10/01 15:47:23 INFO storage.BlockManagerMaster: Updated info of block 
broadcast_84_piece0
14/10/01 15:47:23 INFO broadcast.TorrentBroadcast: Reading broadcast variable 84 
took 15.202260006 s
14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4371392) called with 
curMem=1405806288, maxMem=9168696115
14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84 stored as values 
in memory (estimated size 4.2 MB, free 7.2 GB)


(Notice that a 10 s lag happens after the "Updated info of block
broadcast_..." line and before the MemoryStore log.)


And with HttpBroadcast (0.3 s):

14/10/01 16:05:58 INFO broadcast.HttpBroadcast: Started reading broadcast 
variable 147
14/10/01 16:05:58 INFO storage.MemoryStore: ensureFreeSpace(4369376) called with 
curMem=1373493232, maxMem=9168696115
14/10/01 16:05:58 INFO storage.MemoryStore: Block broadcast_147 stored as values 
in memory (estimated size 4.2 MB, free 7.3 GB)
14/10/01 16:05:58 INFO broadcast.HttpBroadcast: Reading broadcast variable 147 
took 0.320907112 s
14/10/01 16:05:58 INFO storage.BlockManager: Found block 
broadcast_147 locally


Since Torrent is supposed to perform much better than Http, we suspect a
configuration error on our side, but we have been unable to pin it down.
Does anyone have an idea of the origin of the problem?


For now we're sticking with the HttpBroadcast workaround.
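
The workaround is just a configuration switch; a minimal sketch, assuming
the factory class names documented for 1.1:

import org.apache.spark.{SparkConf, SparkContext}

// Revert to HttpBroadcast instead of the TorrentBroadcast default of 1.1.0
val conf = new SparkConf()
  .setAppName("broadcast-test")
  .set("spark.broadcast.factory",
       "org.apache.spark.broadcast.HttpBroadcastFactory")
val sc = new SparkContext(conf)

// Broadcast ~4 MB, comparable to the variable in the logs above
val big = sc.broadcast(new Array[Byte](4 * 1024 * 1024))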

Guillaume
--
eXenSa


*Guillaume PITEL, Président*
+33(0)626 222 431

eXenSa S.A.S. 
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)184 163 677 / Fax +33(0)972 283 705



Re: Spark on Mesos 0.20

2014-10-06 Thread RJ Nowling
I've recently run into this issue as well. I get it from running the Spark
examples, such as the log query example. Maybe that'll help reproduce the
issue.
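
(For reference, I'm launching it with something along the lines of
./bin/run-example LogQuery, with the master pointed at the Mesos cluster;
the exact invocation may differ on your setup.)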


-- 
em rnowl...@gmail.com
c 954.496.2314


Re: Parquet schema migrations

2014-10-06 Thread Cody Koeninger
Sorry, by "raw parquet" I just meant there is no external metadata store,
only the schema written as part of the parquet format.

We've done several different kinds of changes, including column rename and
widening the data type of an existing column.  I don't think it's feasible
to support those.

The kind of change we've made that probably makes the most sense to support
is adding a nullable column. I think that also implies supporting "removing"
a nullable column, as long as you don't end up with columns of the same name
but different types.
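
To make that concrete, an untested sketch of the add-a-nullable-column case
with the 1.1 SQL API, assuming an existing SparkContext sc (paths, table,
and column names are made up):

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// Two generations of the same data set; v2 added a nullable score column
val v1 = sqlContext.parquetFile("/data/events_v1") // (id, name)
val v2 = sqlContext.parquetFile("/data/events_v2") // (id, name, score)
v1.registerTempTable("events_v1")

// Widen v1 to the v2 schema by filling the new column with nulls, then union
val v1Widened = sqlContext.sql(
  "SELECT id, name, CAST(null AS DOUBLE) AS score FROM events_v1")
val all = v1Widened.unionAll(v2)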

I'm not sure it makes sense semantically to do schema merging as part of
union all, and it definitely doesn't make sense to do it by default. I
wouldn't want two accidentally compatible schemas to get merged without
warning. It's also a little odd since, unlike in a normal SQL database,
union all can happen before there are any projections or filters... e.g.
what order do columns come back in if someone does select *?

Seems like there should be either a separate API call, or an optional
argument to union all.

As far as resources go, I can probably put some personal time into this if
we come up with a plan that makes sense.


On Sun, Oct 5, 2014 at 7:36 PM, Michael Armbrust wrote:

> Hi Cody,
>
> Assuming you are talking about 'safe' changes to the schema (i.e. existing
> column names are never reused with incompatible types), this is something
> I'd love to support.  Perhaps you can describe more what sorts of changes
> you are making, and if simple merging of the schemas would be sufficient.
> If so, we can open a JIRA, though I'm not sure when we'll have resources to
> dedicate to this.
>
> In the near term, I'd suggest writing converters for each version of the
> schema, that translate to some desired master schema.  You can then union
> all of these together and avoid the cost of batch conversion.  It seems
> like in most cases this should be pretty efficient, at least now that we
> have good pushdown past union operators :)
>
> Michael
>
> On Sun, Oct 5, 2014 at 3:58 PM, Andrew Ash  wrote:
>
>> Hi Cody,
>>
>> I wasn't aware there were different versions of the parquet format.
>> What's
>> the difference between "raw parquet" and the Hive-written parquet files?
>>
>> As for your migration question, the approaches I've often seen are
>> convert-on-read and convert-all-at-once.  Apache Cassandra for example
>> does
>> both -- when upgrading between Cassandra versions that change the on-disk
>> sstable format, it will do a convert-on-read as you access the sstables,
>> or
>> you can run the upgradesstables command to convert them all at once
>> post-upgrade.
>>
>> Andrew
>>
>> On Fri, Oct 3, 2014 at 4:33 PM, Cody Koeninger wrote:
>>
>> > Wondering if anyone has thoughts on a path forward for parquet schema
>> > migrations, especially for people (like us) that are using raw parquet
>> > files rather than Hive.
>> >
>> > So far we've gotten away with reading old files, converting, and
>> writing to
>> > new directories, but that obviously becomes problematic above a certain
>> > data size.
>> >
>>
>
>


Re: Spark on Mesos 0.20

2014-10-06 Thread Timothy Chen
Ok I created SPARK-3817 to track this, will try to repro it as well.

Tim




Re: EC2 clusters ready in launch time + 30 seconds

2014-10-06 Thread Daniil Osipov
I've also been looking at this. Basically, the Spark EC2 script is
excellent for small development clusters of a few nodes, but it isn't
suitable for production. It handles instance setup in a single-threaded
manner, even though that could easily be parallelized. It also doesn't
handle failure well, e.g. when an instance fails to start or takes too
long to respond.
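
To sketch what I mean by parallelizing (the script itself is Python, but
the idea is the same; setupInstance is a made-up stand-in for the per-node
ssh/rsync work):

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

def setupInstance(host: String): Unit = {
  // ssh in, install packages, deploy config (placeholder)
}

val hosts = Seq("ec2-node-1", "ec2-node-2", "ec2-node-3")
// Set up all nodes concurrently, with a deadline instead of hanging forever
val work = Future.traverse(hosts)(host => Future(setupInstance(host)))
Await.result(work, 30.minutes)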

Our desire was to have an equivalent of the Amazon EMR[1] API that would
trigger Spark jobs, including the specified cluster setup. I've done some
work toward that end, and it would benefit greatly from an updated AMI.

Dan

[1]
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-cli-commands.html

On Sat, Oct 4, 2014 at 7:28 AM, Nicholas Chammas  wrote:

> Thanks for posting that script, Patrick. It looks like a good place to
> start.
>
> Regarding Docker vs. Packer, as I understand it you can use Packer to
> create Docker containers at the same time as AMIs and other image types.
>
> Nick
>
>
> On Sat, Oct 4, 2014 at 2:49 AM, Patrick Wendell wrote:
>
> > Hey All,
> >
> > Just a couple notes. I recently posted a shell script for creating the
> > AMI's from a clean Amazon Linux AMI.
> >
> > https://github.com/mesos/spark-ec2/blob/v3/create_image.sh
> >
> > I think I will update the AMI's soon to get the most recent security
> > updates. For spark-ec2's purpose this is probably sufficient (we'll
> > only need to re-create them every few months).
> >
> > However, it would be cool if someone wanted to tackle providing a more
> > general mechanism for defining Spark-friendly "images" that can be
> > used more generally. I had thought that docker might be a good way to
> > go for something like this - but maybe this packer thing is good too.
> >
> > For one thing, if we had a standard image we could use it to create
> > containers for running Spark's unit test, which would be really cool.
> > This would help a lot with random issues around port and filesystem
> > contention we have for unit tests.
> >
> > I'm not sure if the long term place for this would be inside the spark
> > codebase or a community library or what. But it would definitely be
> > very valuable to have if someone wanted to take it on.
> >
> > - Patrick
> >
> > On Fri, Oct 3, 2014 at 5:20 PM, Nicholas Chammas wrote:
> > > FYI: There is an existing issue -- SPARK-3314
> > >  -- about scripting
> > the
> > > creation of Spark AMIs.
> > >
> > > With Packer, it looks like we may be able to script the creation of
> > > multiple image types (VMWare, GCE, AMI, Docker, etc...) at once from a
> > > single Packer template. That's very cool.
> > >
> > > I'll be looking into this.
> > >
> > > Nick
> > >
> > >
> > > On Thu, Oct 2, 2014 at 8:23 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> > >
> > >> Thanks for the update, Nate. I'm looking forward to seeing how these
> > >> projects turn out.
> > >>
> > >> David, Packer looks very, very interesting. I'm gonna look into it
> more
> > >> next week.
> > >>
> > >> Nick
> > >>
> > >>
> > >> On Thu, Oct 2, 2014 at 8:00 PM, Nate D'Amico wrote:
> > >>
> > >>> Bit of progress on our end, a bit of lagging as well. Our guy leading the
> > >>> effort got a little bogged down on a client project updating the hive/sql
> > >>> testbed to the latest spark/sparkSQL; we're also launching a public
> > >>> service, so we have been a bit scattered recently.
> > >>>
> > >>> We will have some more updates, probably after next week. We are planning
> > >>> on taking our client work around hive/spark, plus taking over the bigtop
> > >>> automation work, to modernize it and get it fit for human consumption
> > >>> outside our org. All our work and puppet modules will be open sourced and
> > >>> documented; hopefully we can start to rally some other folks around the
> > >>> effort who find it useful.
> > >>>
> > >>> Side note: another effort we are looking into is gradle tests/support.
> > >>> We have been leveraging serverspec for some basic infrastructure tests,
> > >>> but with bigtop switching over to a gradle build/testing setup in 0.8 we
> > >>> want to include support for that in our own efforts; there is probably
> > >>> some stuff that can be learned and leveraged in the spark world for
> > >>> repeatable/tested infrastructure.
> > >>>
> > >>> If anyone has any automation questions specific to your environment, you
> > >>> can drop me a line directly; I will try to help out as best I can.
> > >>> Otherwise I will post an update to the dev list once we get on top of our
> > >>> own product release and the bigtop work.
> > >>>
> > >>> Nate
> > >>>
> > >>>
> > >>> -----Original Message-----
> > >>> From: David Rowe [mailto:davidr...@gmail.com]
> > >>> Sent: Thursday, October 02, 2014 4:44 PM
> > >>> To: Nicholas Chammas
> > >>> Cc: dev; Shivaram Venkataraman
> > >>> Subject: Re: EC2 clusters ready in launch time + 30 seconds
> > >>>
> > >>> I think this is exactly what packer is for. See e.g.
> > >>> http://www.packer.i

Re: EC2 clusters ready in launch time + 30 seconds

2014-10-06 Thread Nicholas Chammas
FYI: I've created SPARK-3821, "Develop an automated way of creating Spark
images (AMI, Docker, and others)".



Re: EC2 clusters ready in launch time + 30 seconds

2014-10-06 Thread David Rowe
I agree with this. There is also the issue of differently sized masters and
slaves, the number of executors for hefty machines (e.g. r3.8xlarges),
tagging of instances and volumes (we use this for cost attribution at my
workplace), and running in VPCs.

I think it might be useful to take a layered approach: the first step could
be getting a good, reliable image produced (Nick's ticket), then doing some
work on the launch script.

Regarding the EMR-like service: I think I heard that AWS is planning to add
Spark support to EMR, but as usual there's nothing firm until it's released.



Re: Spark on Mesos 0.20

2014-10-06 Thread Fairiz Azizi
That's what's great about Spark: the community is so active! :)

I compiled Mesos 0.20.1 from the source tarball.

Using the MapR3 Spark 1.1.0 distribution from the Spark downloads page
(spark-1.1.0-bin-mapr3.tgz).

I see no problems for the workloads we are trying.

However, the cluster is small (fewer than 100 cores across 3 nodes).

The workloads read in just a few gigabytes from HDFS, via an IPython
notebook Spark shell.
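
(The shell is launched roughly as documented for 1.1, i.e. something like
IPYTHON=1 ./bin/pyspark with the master pointed at the Mesos cluster.)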

thanks,
Fi



Fairiz "Fi" Azizi



Re: What is the best way to build my developing Spark for testing on EC2?

2014-10-06 Thread Yu Ishikawa
Hi Evan,

Sorry for my late reply, and thank you for your comment.

> As far as cluster set up goes, I usually launch spot instances with the
> spark-ec2 scripts, 
> and then check out a repo which contains a simple driver application for
> my code. 
> Then I have something crude like bash scripts running my program and
> collecting output. 

It's just as you thought.  I agree with you.

> You could have a look at the spark-perf repo if you want something a
> little better principled/automatic. 

I overlooked this. I will give it a try.

best,



-- Yu Ishikawa




Pull Requests

2014-10-06 Thread Bill Bejeck
Once a PR has been tested and verified, when does it get pulled back into
the trunk?


Re: Pull Requests

2014-10-06 Thread Bill Bejeck
Can someone review patch #2309 (JIRA task SPARK-3178)?

Thanks

On Mon, Oct 6, 2014 at 10:41 PM, Patrick Wendell  wrote:

> Hey Bill,
>
> Automated testing is just one small part of the process that performs
> basic sanity checks on code. All patches need to be championed and
> merged by a committer to make it into Spark. For large patches we also
> ask users to propose a design before sending a patch.
>
> This is discussed in our contributing page:
> https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
>
> If there is a patch that you are waiting for feedback on feel free to
> just ping this list with the patch number. Is there one you are
> waiting for feedback on?
>
> - Patrick


Re: Hyper Parameter Tuning Algorithms

2014-10-06 Thread Ameet Talwalkar
Hi Lochana,

This post is also referring to the MLbase project I mentioned in my
previous email.  We have not open-sourced this work, but plan to do so.

Moreover, you might want to check out the following JIRA ticket
that includes the design
doc for ML pipelines and parameters in MLlib.  This design will include
many of the ideas from our MLbase work.
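
In the meantime, a simple grid search is easy to hand-roll on top of MLlib;
an untested sketch against the 1.1 API (the parameter grids and the choice
of SVM/AUC are made up for illustration):

import org.apache.spark.mllib.classification.SVMWithSGD
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

def gridSearch(train: RDD[LabeledPoint], test: RDD[LabeledPoint]) = {
  val numIterations = 100
  val results = for {
    stepSize <- Seq(0.01, 0.1, 1.0)
    regParam <- Seq(0.001, 0.01, 0.1)
  } yield {
    val model = SVMWithSGD.train(train, numIterations, stepSize, regParam, 1.0)
    model.clearThreshold() // output raw scores so AUC is meaningful
    val scoreAndLabels = test.map(p => (model.predict(p.features), p.label))
    val auc = new BinaryClassificationMetrics(scoreAndLabels).areaUnderROC()
    ((stepSize, regParam), auc)
  }
  results.maxBy(_._2) // best (stepSize, regParam) pair by area under ROC
}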

-Ameet

On Sun, Oct 5, 2014 at 7:28 PM, Lochana Menikarachchi wrote:

> Found this thread from April:
>
> http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3ccabjxkq6b7sfaxie4+aqtcmd8jsqbznsxsfw6v5o0wwwouob...@mail.gmail.com%3E
>
> Wondering what the status of this is. We are thinking about implementing
> these algorithms. It would be a waste if they are already available.
>
> Please advise.
>
> Thanks.
>
> Lochana