Hi Ted,

Thanks for getting back - I realised my mistake: I was clicking the little drop-down menu on the right-hand side of the Create button (it looks as if it's part of the button). When I clicked directly on the word "Create", I got a form that made more sense and allowed me to choose the project.
Regards,

James

On 7 March 2016 at 13:09, Ted Yu <yuzhih...@gmail.com> wrote:

> Have you tried clicking on the Create button from an existing Spark JIRA?
> e.g.
> https://issues.apache.org/jira/browse/SPARK-4352
>
> Once you're logged in, you should be able to select Spark as the Project.
>
> Cheers
>
> On Mon, Mar 7, 2016 at 2:54 AM, James Hammerton <ja...@gluru.co> wrote:
>
>> Hi,
>>
>> So I managed to isolate the bug and I'm ready to try raising a JIRA
>> issue. I joined the Apache JIRA project so I can create tickets.
>>
>> However, when I click Create from the Spark project home page on JIRA, it
>> asks me to click on one of the following service desks: Kylin, Atlas,
>> Ranger, Apache Infrastructure. There doesn't seem to be an option for me
>> to raise an issue for Spark?!
>>
>> Regards,
>>
>> James
>>
>> On 4 March 2016 at 14:03, James Hammerton <ja...@gluru.co> wrote:
>>
>>> Sure thing, I'll see if I can isolate this.
>>>
>>> Regards,
>>>
>>> James
>>>
>>> On 4 March 2016 at 12:24, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> If you can reproduce the following with a unit test, I suggest you
>>>> open a JIRA.
>>>>
>>>> Thanks
>>>>
>>>> On Mar 4, 2016, at 4:01 AM, James Hammerton <ja...@gluru.co> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've come across some strange behaviour with Spark 1.6.0.
>>>>
>>>> In the code below, the filtering by "eventName" only seems to work if
>>>> I call .cache on the resulting DataFrame.
>>>>
>>>> If I don't do this, the code crashes inside the UDF because it
>>>> processes an event that the filter should get rid of.
>>>>
>>>> Any ideas why this might be the case?
>>>>
>>>> The code is as follows:
>>>>
>>>>> val df = sqlContext.read.parquet(inputPath)
>>>>> val filtered = df.filter(df("eventName").equalTo(Created))
>>>>> val extracted = extractEmailReferences(sqlContext,
>>>>>   filtered.cache) // Caching seems to be required for the filter to work
>>>>> extracted.write.parquet(outputPath)
>>>>
>>>> where extractEmailReferences does this:
>>>>
>>>>> def extractEmailReferences(sqlContext: SQLContext, df: DataFrame): DataFrame = {
>>>>>   val extracted = df.select(df(EventFieldNames.ObjectId),
>>>>>     extractReferencesUDF(df(EventFieldNames.EventJson),
>>>>>       df(EventFieldNames.ObjectId), df(EventFieldNames.UserId)) as "references")
>>>>>   extracted.filter(extracted("references").notEqual("UNKNOWN"))
>>>>> }
>>>>
>>>> and extractReferencesUDF:
>>>>
>>>>> def extractReferencesUDF = udf(extractReferences(_: String, _: String, _: String))
>>>>>
>>>>> def extractReferences(eventJson: String, objectId: String, userId: String): String = {
>>>>>   import org.json4s.jackson.Serialization
>>>>>   import org.json4s.NoTypeHints
>>>>>   implicit val formats = Serialization.formats(NoTypeHints)
>>>>>
>>>>>   val created = Serialization.read[GMailMessage.Created](eventJson)
>>>>>   // This is where the code crashes if the .cache isn't called
>>>>
>>>> Regards,
>>>>
>>>> James
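[Editor's note: the unit-test-style reproduction Ted asked for could be sketched roughly as below. This is a minimal, self-contained sketch, not the reporter's actual code: the event data, column names and the `parse` UDF are hypothetical stand-ins for `extractReferencesUDF`/`Serialization.read`, and it assumes Spark 1.6's `SQLContext` API as used in the thread.]

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.udf

object FilterCacheRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("filter-cache-repro"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // One well-formed "Created" event and one event the filter should remove.
    val df = Seq(
      ("Created", """{"id":"1"}"""),
      ("Deleted", "not json")
    ).toDF("eventName", "eventJson")

    // Stand-in for extractReferencesUDF: throws on the malformed row,
    // the way Serialization.read does in the original report.
    val parse = udf { (json: String) =>
      if (!json.startsWith("{")) sys.error(s"unparseable event: $json")
      json
    }

    val filtered = df.filter(df("eventName").equalTo("Created"))
    // Per the report, without .cache on `filtered` here the UDF can end up
    // being evaluated against the "Deleted" row and crash.
    val extracted = filtered.select(parse(filtered("eventJson")) as "references")
    extracted.collect().foreach(println)

    sc.stop()
  }
}
```

If this sketch crashes without the `.cache` and succeeds with `filtered.cache` substituted in, it should serve as the isolated test case for the JIRA.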