ror;
>>>> at org.apache.spark.ml.recommendation.ALS.fit(ALS.scala:452)
>>>> at yelp.TestUser.main(TestUser.java:101)
>>>>
>>>> here line 101 in the above error is the following in code.
>>>>
>>>> ALSModel model = als.fit(training)
, Jun 18, 2016 at 6:03 AM, Jacek Laskowski wrote:
> On Sat, Jun 18, 2016 at 6:13 AM, Pedro Rodriguez
> wrote:
>
> > using Datasets (eg using $ to select columns).
>
> Or even my favourite one - the tick ` :-)
>
> Jacek
>
--
Pedro Rodriguez
PhD Student in Distri
an add more contents
> incrementally.
>
> We should definitely cover more about Dataset.
>
>
> Cheng
>
> On 6/17/16 10:28 PM, Pedro Rodriguez wrote:
>
> The updates look great!
>
> Looks like many places are updated to the new APIs, but there still isn't
> a s
; Cheng
> On 6/17/16 9:13 PM, Pedro Rodriguez wrote:
>
> Hi All,
>
> At my workplace we are starting to use Datasets in 1.6.1 and even more
> with Spark 2.0 in place of DataFrames. I looked at the 1.6.1 documentation
> then the 2.0 documentation and it looks like not much time
hat if the data was skewed while joining, it would take a long time
> to finish the job (99 percent of tasks finish in seconds, while 1 percent
> take minutes to an hour).
>
> How should skewed data be handled in Spark?
>
> Thanks,
> Selvam R
> +91-97877-87724
>
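One common way to approach the skewed join described above is key salting: a random suffix is added to each key on the large, skewed side, and every row on the small side is replicated once per suffix, so the rows of a hot key spread across several join keys. The sketch below shows the idea in plain Python rather than Spark; every function name and the `num_salts` parameter are hypothetical and chosen only for illustration.

```python
import random


def hash_join(left, right):
    """Plain inner join of two lists of (key, value) pairs."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


def salt_skewed_side(rows, num_salts, seed=0):
    """Attach a random salt to each key so one hot key becomes several keys."""
    rng = random.Random(seed)
    return [((key, rng.randrange(num_salts)), value) for key, value in rows]


def replicate_small_side(rows, num_salts):
    """Emit each small-side row once per salt so every salted key finds a match."""
    return [((key, salt), value) for key, value in rows for salt in range(num_salts)]


def salted_join(big, small, num_salts, seed=0):
    """Join on (key, salt), then drop the salt from the result."""
    joined = hash_join(
        salt_skewed_side(big, num_salts, seed),
        replicate_small_side(small, num_salts),
    )
    return [(key, lv, rv) for (key, _salt), lv, rv in joined]
```

The result matches the unsalted join; the cost is that the small side grows by a factor of `num_salts`, so this tends to pay off only when one side is small and a few keys dominate the other.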
--
Pedro Rodriguez
P
creating and using Datasets (eg using $ to
select columns). Is this of value, and if so what should my next step be to
get this going (create JIRA etc)?
--
Pedro Rodriguez
PhD Student in Distributed Machine Learning | CU Boulder
R&D Data Science Intern at Oracle Data Cloud
UC Berkeley AM
be helpful to know what to
look for or if it's better to ask library maintainers directly.
Thanks,
Pedro Rodriguez
On Fri, Jun 17, 2016 at 10:46 AM, Xinh Huynh wrote:
> Here are some guidelines about contributing to Spark:
>
> https://cwiki.apache.org/confluence/display/SPARK/Contri
>
> https://issues.apache.org/jira/issues/?filter=12333428
>
> For a specific release, you can also filter by release, and Reynold had
> sent this a few days ago for 1.5.1
>
> https://issues.apache.org/jira/issues/?filter=1221
>
>
> On Tue, Sep 22, 2015 at 8:50 A
ls for
the next release (be it 1.5.1 or 1.6) with some parent issues along with
smaller child issues to work on (like the built-ins ticket from 1.5)?
Thanks,
--
Pedro Rodriguez
PhD Student in Distributed Machine Learning | CU Boulder
UC Berkeley AMPLab Alumni
ski.rodrig...@gmail.com | pedrorodrigue
g the web UI; Spree currently involves two JS servers so
> some rewriting of things would probably have to happen), why it might be
> good to do, and why it might be not good or not worth it (e.g. Spark should
> make sure it’s possible and easy to do sophisticated things like this
> outside of the Spark repo, putting more work in the driver process is a bad
> idea, etc.).
>
> OK, that’s my brain dump, I’d love to hear people’s thoughts on any/all of
> this, otherwise thanks for the APIs and sorry for having to cheat them a
> bit! :)
>
> -Ryan
>
>
--
Pedro Rodriguez
UCBerkeley 2014 | Computer Science
SnowGeek <http://SnowGeek.org>
pedro-rodriguez.com
ski.rodrig...@gmail.com
208-340-1703
Mon, Jul 27, 2015 at 1:09 PM, Pedro Rodriguez
> wrote:
>
>> I am having the same issue, but the python style checks are failing on
>> the Jenkins build server. Is anyone else having this problem? Failed build
>> is here:
>> https://amplab.cs.berkeley.edu/jenkins/j
I am having the same issue, but the python style checks are failing on the
Jenkins build server. Is anyone else having this problem? Failed build is
here:
https://amplab.cs.berkeley.edu/jenkins/job/SlowSparkPullRequestBuilder/121/console
Pedro Rodriguez
On Mon, Jul 27, 2015 at 7:10 AM, Yu
? Does this seem like a good idea?
The implementation would be to have python zip the given directory into a
tmp directory, then ship that to the cluster.
--
Pedro Rodriguez
CU Boulder PhD Student
UCBerkeley 2014 | Computer Science
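The zip-then-ship step proposed above could look roughly like the following. This is a minimal sketch, not the implementation that was proposed for Spark: the helper name `zip_directory` and the temp-directory prefix are hypothetical, and it uses only the Python standard library.

```python
import os
import shutil
import tempfile


def zip_directory(src_dir):
    """Zip src_dir into a fresh temporary directory and return the archive path."""
    # abspath also strips any trailing separator, so split() yields the real name.
    src_dir = os.path.abspath(src_dir)
    parent, name = os.path.split(src_dir)
    tmp_dir = tempfile.mkdtemp(prefix="pyspark_deps_")  # hypothetical prefix
    # root_dir/base_dir keep the directory itself as the top-level entry of the
    # archive, so it remains importable as a package once on the Python path.
    return shutil.make_archive(
        os.path.join(tmp_dir, name), "zip", root_dir=parent, base_dir=name
    )
```

With an existing `SparkContext` named `sc`, the resulting archive could then be shipped with `sc.addPyFile(zip_directory("mypkg"))`, since `addPyFile` accepts `.zip` files; cleaning up the temporary directory afterwards is left out of this sketch.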