RE: Spark 1.6.0: substring on df.select
Thanks Raghav. I have 5+ million records, so I feel creating multiple columns is not an optimal approach. Could you please suggest an alternative? Can’t we do any string operation inside df.select? Regards, Raja

From: Raghavendra Pandey
Sent: 11 May 2016 09:04 PM
To: Bharathi Raja
Cc: User
Subject: Re: Spark 1.6.0: substring on df.select

You can create a column with the count of “/”, then take the max of it and create that many columns for every row, with null fillers. Raghav

On 11 May 2016 20:37, "Bharathi Raja" wrote:
Hi, I have a dataframe column col1 with values such as “/client/service/version/method”. The number of “/” separators is not constant. Could you please help me extract the method from every value in column col1? In Pig I used SUBSTRING with LAST_INDEX_OF(“/”). Thanks in advance. Regards, Raja
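For the substring question above: rather than creating one column per “/” segment, Spark’s built-in substring_index function (in org.apache.spark.sql.functions since 1.5, so available on 1.6.0) extracts the part after the last delimiter directly inside a select. A minimal sketch, assuming a DataFrame df with a string column col1:

```scala
import org.apache.spark.sql.functions.substring_index

// substring_index(col, "/", -1) keeps everything to the right of the
// last "/" -- i.e. the method name -- no matter how many "/"
// separators each value of col1 contains.
val methods = df.select(substring_index(df("col1"), "/", -1).as("method"))
```

This mirrors the Pig SUBSTRING + LAST_INDEX_OF approach in a single expression, and the same function should also be callable from Spark SQL as substring_index(col1, '/', -1).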
Spark 1.6.0: substring on df.select
Hi, I have a dataframe column col1 with values such as “/client/service/version/method”. The number of “/” separators is not constant. Could you please help me extract the method from every value in column col1? In Pig I used SUBSTRING with LAST_INDEX_OF(“/”). Thanks in advance. Regards, Raja
RE: How to Parse & flatten JSON object in a text file using Spark&Scala into Dataframe
Thanks Eran, I'll check the solution. Regards, Raja

-----Original Message-----
From: "Eran Witkon"
Sent: 12/24/2015 4:07 PM
To: "Bharathi Raja"; "Gokula Krishnan D"
Cc: "user@spark.apache.org"
Subject: Re: How to Parse & flatten JSON object in a text file using Spark & Scala into Dataframe

Raja! I found the answer to your question! Look at http://stackoverflow.com/questions/34069282/how-to-query-json-data-column-using-spark-dataframes. This is what you (and I) were looking for. General idea: you read the line as text, where ProjectDetails is just a string field, then you build the JSON string representation of the whole line, and you end up with a nested JSON schema which SparkSQL can read. Eran

On Thu, Dec 24, 2015 at 10:26 AM Eran Witkon wrote:
I don't have the exact answer for you, but I would look for something using the explode method on DataFrame.

On Thu, Dec 24, 2015 at 7:34 AM Bharathi Raja wrote:
Thanks Gokul, but the file I have has the same format as I mentioned. The first two columns are not in JSON format. Thanks, Raja

From: Gokula Krishnan D
Sent: 12/24/2015 2:44 AM
To: Eran Witkon
Cc: raja kbv; user@spark.apache.org
Subject: Re: How to Parse & flatten JSON object in a text file using Spark & Scala into Dataframe

You can try this, but I slightly modified the input structure since the first two columns were not in JSON format. Thanks & Regards, Gokula Krishnan (Gokul)

On Wed, Dec 23, 2015 at 9:46 AM, Eran Witkon wrote:
Did you get a solution for this?

On Tue, 22 Dec 2015 at 20:24 raja kbv wrote:
Hi, I am new to Spark. I have a text file with the structure below.
(employeeID: Int, Name: String, ProjectDetails: JsonObject{[{ProjectName, Description, Duration, Role}]})

Eg: (123456, Employee1, {"ProjectDetails": [
  {"ProjectName": "Web Development", "Description": "Online Sales website", "Duration": "6 Months", "Role": "Developer"},
  {"ProjectName": "Spark Development", "Description": "Online Sales Analysis", "Duration": "6 Months", "Role": "Data Engineer"},
  {"ProjectName": "Scala Training", "Description": "Training", "Duration": "1 Month"}
]})

Could someone help me parse & flatten each record into the dataframe below, using Scala?

employeeID, Name, ProjectName, Description, Duration, Role
123456, Employee1, Web Development, Online Sales website, 6 Months, Developer
123456, Employee1, Spark Development, Online Sales Analysis, 6 Months, Data Engineer
123456, Employee1, Scala Training, Training, 1 Month, null

Thank you in advance. Regards, Raja
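One way to implement the approach Eran describes (read each line as text, rebuild it as a single JSON document, and let SparkSQL infer the nested schema, then flatten with explode). This is a sketch, not a tested solution: it assumes Spark 1.6 (sc and sqlContext in scope), a hypothetical input path employees.txt in exactly the format shown, and that the first two fields contain no embedded commas.

```scala
import org.apache.spark.sql.functions.explode

// Turn each "(id, name, {"ProjectDetails":[...]})" line into one JSON
// document so read.json can infer a nested schema.
val raw = sc.textFile("employees.txt") // hypothetical path
val asJson = raw.map { line =>
  val body = line.trim.stripPrefix("(").stripSuffix(")")
  // Split into at most 3 parts so commas inside the JSON are preserved.
  val Array(id, name, details) = body.split(",", 3)
  // Merge the ProjectDetails object into one top-level JSON document.
  s"""{"employeeID": ${id.trim}, "Name": "${name.trim}", ${details.trim.stripPrefix("{")}"""
}

val df = sqlContext.read.json(asJson)

// explode yields one row per element of the ProjectDetails array;
// projects missing a field (e.g. Role) come out as null.
val flat = df
  .select(df("employeeID"), df("Name"), explode(df("ProjectDetails")).as("p"))
  .select("employeeID", "Name",
          "p.ProjectName", "p.Description", "p.Duration", "p.Role")
```

flat should then have the six-column shape requested in the question, with null in Role for the Scala Training project.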
How to ignore case in dataframe groupby?
Hi, Values in a dataframe column named countrycode are in mixed case, e.g. (US, us). groupBy & count gives two rows, but the requirement is to ignore case for this operation. 1) Is there a way to ignore case in groupBy? Or 2) is there a way to update the dataframe column countrycode to uppercase? Thanks in advance. Regards, Raja
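Both options can be done with the upper function from org.apache.spark.sql.functions. A sketch, assuming a DataFrame df with the countrycode column:

```scala
import org.apache.spark.sql.functions.upper

// Option 2: rewrite the column to uppercase, then group as usual --
// "US" and "us" now land in the same group.
val counts = df
  .withColumn("countrycode", upper(df("countrycode")))
  .groupBy("countrycode")
  .count()

// Option 1: normalise only inside the grouping expression, leaving
// the original column values untouched.
val counts2 = df.groupBy(upper(df("countrycode")).as("countrycode")).count()
```

Which to choose depends on whether the rest of the job needs the original mixed-case values; Option 1 keeps them intact.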
RE: How to Parse & flatten JSON object in a text file using Spark &Scala into Dataframe
Thanks Gokul, but the file I have has the same format as I mentioned. The first two columns are not in JSON format. Thanks, Raja

-----Original Message-----
From: "Gokula Krishnan D"
Sent: 12/24/2015 2:44 AM
To: "Eran Witkon"
Cc: "raja kbv"; "user@spark.apache.org"
Subject: Re: How to Parse & flatten JSON object in a text file using Spark & Scala into Dataframe

You can try this, but I slightly modified the input structure since the first two columns were not in JSON format. Thanks & Regards, Gokula Krishnan (Gokul)

On Wed, Dec 23, 2015 at 9:46 AM, Eran Witkon wrote:
Did you get a solution for this?

On Tue, 22 Dec 2015 at 20:24 raja kbv wrote:
Hi, I am new to Spark. I have a text file with the structure below.

(employeeID: Int, Name: String, ProjectDetails: JsonObject{[{ProjectName, Description, Duration, Role}]})

Eg: (123456, Employee1, {"ProjectDetails": [
  {"ProjectName": "Web Development", "Description": "Online Sales website", "Duration": "6 Months", "Role": "Developer"},
  {"ProjectName": "Spark Development", "Description": "Online Sales Analysis", "Duration": "6 Months", "Role": "Data Engineer"},
  {"ProjectName": "Scala Training", "Description": "Training", "Duration": "1 Month"}
]})

Could someone help me parse & flatten each record into the dataframe below, using Scala?

employeeID, Name, ProjectName, Description, Duration, Role
123456, Employee1, Web Development, Online Sales website, 6 Months, Developer
123456, Employee1, Spark Development, Online Sales Analysis, 6 Months, Data Engineer
123456, Employee1, Scala Training, Training, 1 Month, null

Thank you in advance. Regards, Raja
RE: How to Parse & flatten JSON object in a text file using Spark &Scala into Dataframe
Hi Eran, I didn't get the solution yet. Thanks, Raja

-----Original Message-----
From: "Eran Witkon"
Sent: 12/23/2015 8:17 PM
To: "raja kbv"; "user@spark.apache.org"
Subject: Re: How to Parse & flatten JSON object in a text file using Spark & Scala into Dataframe

Did you get a solution for this?

On Tue, 22 Dec 2015 at 20:24 raja kbv wrote:
Hi, I am new to Spark. I have a text file with the structure below.

(employeeID: Int, Name: String, ProjectDetails: JsonObject{[{ProjectName, Description, Duration, Role}]})

Eg: (123456, Employee1, {"ProjectDetails": [
  {"ProjectName": "Web Development", "Description": "Online Sales website", "Duration": "6 Months", "Role": "Developer"},
  {"ProjectName": "Spark Development", "Description": "Online Sales Analysis", "Duration": "6 Months", "Role": "Data Engineer"},
  {"ProjectName": "Scala Training", "Description": "Training", "Duration": "1 Month"}
]})

Could someone help me parse & flatten each record into the dataframe below, using Scala?

employeeID, Name, ProjectName, Description, Duration, Role
123456, Employee1, Web Development, Online Sales website, 6 Months, Developer
123456, Employee1, Spark Development, Online Sales Analysis, 6 Months, Data Engineer
123456, Employee1, Scala Training, Training, 1 Month, null

Thank you in advance. Regards, Raja