Affinity Propagation

2016-06-07 Thread Tim Gautier
Does anyone know of a good library, usable in Scala with Spark, that implements affinity propagation?
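Spark MLlib (as of 1.6) does not ship affinity propagation — its closest built-in is PowerIterationClustering — so for similarity matrices that fit on one machine, a local implementation is an option. Below is a minimal single-machine sketch of the standard message-passing updates (responsibilities and availabilities with damping); the function name and structure are hypothetical, not any library's API.

```scala
// Minimal single-machine affinity propagation sketch (hypothetical helper,
// not a Spark/MLlib API). s is a dense similarity matrix; the diagonal
// holds each point's "preference" for being an exemplar.
def affinityPropagation(s: Array[Array[Double]],
                        iterations: Int = 200,
                        damping: Double = 0.5): Array[Int] = {
  val n = s.length
  val r = Array.fill(n, n)(0.0) // responsibilities
  val a = Array.fill(n, n)(0.0) // availabilities

  for (_ <- 0 until iterations) {
    // r(i)(k) <- s(i)(k) - max over k' != k of (a(i)(k') + s(i)(k'))
    for (i <- 0 until n) {
      val as = Array.tabulate(n)(k => a(i)(k) + s(i)(k))
      for (k <- 0 until n) {
        val best = (0 until n).filter(_ != k).map(j => as(j)).max
        r(i)(k) = damping * r(i)(k) + (1 - damping) * (s(i)(k) - best)
      }
    }
    // a(i)(k) <- min(0, r(k)(k) + sum of max(0, r(i')(k))), i' not in {i,k}
    // a(k)(k) <- sum of max(0, r(i')(k)), i' != k
    for (k <- 0 until n) {
      val pos = Array.tabulate(n)(i => math.max(0.0, r(i)(k)))
      for (i <- 0 until n) {
        val upd =
          if (i == k) (0 until n).filter(_ != k).map(j => pos(j)).sum
          else math.min(0.0, r(k)(k) +
            (0 until n).filter(j => j != i && j != k).map(j => pos(j)).sum)
        a(i)(k) = damping * a(i)(k) + (1 - damping) * upd
      }
    }
  }
  // Each point's exemplar is the k maximizing a(i)(k) + r(i)(k).
  Array.tabulate(n)(i => (0 until n).maxBy(k => a(i)(k) + r(i)(k)))
}

// Two well-separated 1-D clusters; similarity = negative squared distance.
val pts = Array(0.0, 0.1, 0.2, 10.0, 10.1, 10.2)
val sims = Array.tabulate(pts.length, pts.length)((i, j) =>
  -(pts(i) - pts(j)) * (pts(i) - pts(j)))
// A common default preference: the median off-diagonal similarity.
val off = (for { i <- pts.indices; j <- pts.indices if i != j } yield sims(i)(j)).sorted
val pref = off(off.length / 2)
for (i <- pts.indices) sims(i)(i) = pref

val labels = affinityPropagation(sims)
```

Each point's cluster label is the index of its exemplar, so the two tight clusters above come back with one shared exemplar apiece. For data that does not fit in one executor's memory this sketch would need a genuinely distributed formulation, which is exactly what the question is after.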

Re: Map tuple to case class in Dataset

2016-06-01 Thread Tim Gautier
n 1, 2016 at 9:05 AM Tim Gautier <tim.gaut...@gmail.com> wrote:
> I spun up another EC2 cluster today with Spark 1.6.1 and I still get the
> error.
>
> scala> case class Test(a: Int)
> defined class Test
>
> scala> Seq(1,2).toDS.map(t =>

Re: Map tuple to case class in Dataset

2016-06-01 Thread Tim Gautier
ne3.$read.(:26)
at $line3.$read$.(:30)
at $line3.$read$.()
... 18 more

On Tue, May 31, 2016 at 8:48 PM Tim Gautier <tim.gaut...@gmail.com> wrote:
> That's really odd. I copied that code directly out of the shell and it
> errored out on me, several times. I wonder if something I did pre

Re: Map tuple to case class in Dataset

2016-05-31 Thread Tim Gautier
ark-shell of 1.6.1 :
>
> scala> case class Test(a: Int)
> defined class Test
>
> scala> Seq(1,2).toDS.map(t => Test(t)).show
> +---+
> |  a|
> +---+
> |  1|
> |  2|
> +---+
>
> FYI
>
> On Tue, May 31, 2016 at 7:35 PM, Tim Gautier <tim.gaut

Re: Map tuple to case class in Dataset

2016-05-31 Thread Tim Gautier
sion of Spark? What is the exception?
>>
>> On Tue, May 31, 2016 at 4:17 PM, Tim Gautier <tim.gaut...@gmail.com>
>> wrote:
>>
>>> How should I go about mapping from say a Dataset[(Int,Int)] to a
>>> Dataset[]?
>>>
>>> I tried to

Map tuple to case class in Dataset

2016-05-31 Thread Tim Gautier
How should I go about mapping from say a Dataset[(Int,Int)] to a Dataset[]? I tried to use a map, but it throws exceptions: case class Test(a: Int) Seq(1,2).toDS.map(t => Test(t)).show Thanks, Tim
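The mapping step itself is ordinary Scala; the Dataset-specific part is having an Encoder for the case class in scope (via `import sqlContext.implicits._` in 1.6). A plain-collections sketch of the two usual shapes of the transformation, with `Test2` as a hypothetical name since the thread's own case class takes one field:

```scala
// Plain-Scala analogue of the Dataset map (no Spark involved).
// Test2 is a hypothetical two-field case class for illustration.
case class Test2(a: Int, b: Int)

val pairs = Seq((1, 2), (3, 4))

// Pattern-match each tuple into the case class...
val mapped = pairs.map { case (x, y) => Test2(x, y) }

// ...or lift the constructor into a function over pairs.
val mapped2 = pairs.map((Test2.apply _).tupled)
```

On a Dataset the same shape is `ds.map { case (x, y) => Test2(x, y) }`. The errors discussed later in this thread appear tied to case classes defined at the spark-shell REPL top level in 1.6 rather than to the map itself; defining the class in a compiled jar (or pasting class and usage together with `:paste`) is a commonly suggested workaround, though this thread never fully pins the trigger down.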

Re: I'm pretty sure this is a Dataset bug

2016-05-27 Thread Tim Gautier
lysis.scala:59)
>> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:287)
>> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:287)
>> at org.apache.spark.sql.catalyst.trees.CurrentOrigin

Re: Undocumented left join constraint?

2016-05-27 Thread Tim Gautier
.Int", name: "id"),- root class: "$iwC.$iwC.Test"),false,ObjectType(class $iwC$$iwC$Test),Some($iwC$$iwC@6e40bddd))
+- assertnotnull(input[1, StructType(StructField(id,IntegerType,true))].id,- field (class: "scala.Int", name: "id"),- root class: "$

Re: Undocumented left join constraint?

2016-05-27 Thread Tim Gautier
ark.sql.Dataset[Test] = [id: int]
>
> scala> test1.as("t1").joinWith(test2.as("t2"), $"t1.id" === $"t2.id",
> "left_outer").show
> +---+------+
> | _1|    _2|
> +---+------+
> |[1]|[null]|
> |[2]|   [2]|
> |

Undocumented left join constraint?

2016-05-27 Thread Tim Gautier
Is it truly impossible to left join a Dataset[T] on the right if T has any non-option fields? It seems Spark tries to create Ts with null values in all fields when left joining, which results in null pointer exceptions. In fact, I haven't found any other way to get around this issue without making
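The `joinWith(..., "left_outer")` output quoted elsewhere in this thread shows what is going on: unmatched left rows get a right-hand `T` built from nulls, which blows up as soon as `T` has a primitive (non-Option) field. The shape a left outer join naturally wants is `(T, Option[T])`. A plain-collections sketch of that shape, with `Rec` and `leftJoin` as hypothetical names:

```scala
// Plain-Scala sketch of left-outer-join semantics: unmatched right-hand
// rows come back as None instead of a Rec full of nulls.
// Rec and leftJoin are hypothetical names for illustration.
case class Rec(id: Int, v: String)

def leftJoin(left: Seq[Rec], right: Seq[Rec]): Seq[(Rec, Option[Rec])] = {
  val byId = right.groupBy(_.id)
  left.flatMap { l =>
    byId.get(l.id) match {
      case Some(rs) => rs.map(r => (l, Some(r): Option[Rec]))
      case None     => Seq((l, None))
    }
  }
}

val joined = leftJoin(Seq(Rec(1, "a"), Rec(2, "b")), Seq(Rec(2, "B")))
```

With Datasets the usual workarounds follow the same idea: either declare the nullable fields of `T` as `Option[_]`, or keep the `joinWith` tuple and map it into `(left, Option(right))` before touching the right side's fields.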

Re: I'm pretty sure this is a Dataset bug

2016-05-27 Thread Tim Gautier
ped = test.map(t => t.copy(id = t.id + 1))
testMapped.as("t1").joinWith(testMapped.as("t2"), $"t1.id" === $"t2.id").show

On Fri, May 27, 2016 at 11:16 AM Tim Gautier <tim.gaut...@gmail.com> wrote:
> I figured it out the trigger. Turns out it wasn'

Re: I'm pretty sure this is a Dataset bug

2016-05-27 Thread Tim Gautier
tMapped.as("t1").joinWith(testMapped.as("t2"), $"t1.id" === $"t2.id").show // <-- error

On Fri, May 27, 2016 at 10:44 AM Tim Gautier <tim.gaut...@gmail.com> wrote:
> I stand corrected. I just created a test table with a single int field to

Re: I'm pretty sure this is a Dataset bug

2016-05-27 Thread Tim Gautier
I stand corrected. I just created a test table with a single int field to test with and the Dataset loaded from that works with no issues. I'll see if I can track down exactly what the difference might be.

On Fri, May 27, 2016 at 10:29 AM Tim Gautier <tim.gaut...@gmail.com> wrote:
>

Re: I'm pretty sure this is a Dataset bug

2016-05-27 Thread Tim Gautier
a sql database.

On Fri, May 27, 2016 at 10:15 AM Ted Yu <yuzhih...@gmail.com> wrote:
> Which release of Spark are you using ?
>
> Is it possible to come up with fake data that shows what you described ?
>
> Thanks
>
> On Fri, May 27, 2016 at 8:24 AM, Tim Gautier <

I'm pretty sure this is a Dataset bug

2016-05-27 Thread Tim Gautier
Unfortunately I can't show exactly the data I'm using, but this is what I'm seeing: I have a case class 'Product' that represents a table in our database. I load that data via sqlContext.read.format("jdbc").options(...).load.as[Product] and register it in a temp table 'product'. For testing, I

Dataset Set Operations

2016-05-24 Thread Tim Gautier
Hello All, I've been trying to subtract one dataset from another. Both datasets contain case classes of the same type. When I subtract B from A, I end up with a copy of A that still has the records of B in it. (An intersection of A and B always results in 0 results.) All I can figure is that
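The symptom described (A minus B still contains B's records, and A intersect B is empty) means the two sides' rows are never comparing equal. Set difference and intersection both hinge on `equals`/`hashCode`, which case classes derive structurally; plain collections show the semantics the Dataset operations should match, with `Item` as a hypothetical case class:

```scala
// Expected set-operation semantics, shown with plain collections.
// Relies on the case class's structural equals/hashCode.
// Item is a hypothetical name for illustration.
case class Item(id: Int)

val a = Seq(Item(1), Item(2), Item(3))
val b = Seq(Item(2), Item(3))

val subtracted   = a.filterNot(b.toSet) // A minus B
val intersection = a.filter(b.toSet)    // A intersect B
```

If the Dataset versions diverge from this, a useful first check is whether a record loaded from one source compares equal, field for field, to the "same" record from the other source after both pass through their encoders.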