[ https://issues.apache.org/jira/browse/SPARK-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-8629.
------------------------------
    Resolution: Invalid

> R code in SparkR
> ----------------
>
>                 Key: SPARK-8629
>                 URL: https://issues.apache.org/jira/browse/SPARK-8629
>             Project: Spark
>          Issue Type: Question
>          Components: R
>            Reporter: Arun
>            Priority: Minor
>
> Data set:
>
> DC_City    Dc_Code  ItemNo     Itemdescription              date        Month   Year  SalesQuantity
> Hyderabad  11       100005010  more. Value Chana Dal 1 Kg.  9/16/2012   9-Sep   2012  1
> Hyderabad  11       100005010  more. Value Chana Dal 1 Kg.  12/21/2012  12-Dec  2012  1
> Hyderabad  11       100005010  more. Value Chana Dal 1 Kg.  1/12/2013   1-Jan   2013  1
> Hyderabad  11       100005010  more. Value Chana Dal 1 Kg.  1/27/2013   1-Jan   2013  3
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/1/2013    2-Feb   2013  2
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/12/2013   2-Feb   2013  3
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/13/2013   2-Feb   2013  2
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/14/2013   2-Feb   2013  1
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/15/2013   2-Feb   2013  8
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/16/2013   2-Feb   2013  18
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/17/2013   2-Feb   2013  19
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/18/2013   2-Feb   2013  18
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/19/2013   2-Feb   2013  18
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/20/2013   2-Feb   2013  16
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/21/2013   2-Feb   2013  25
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/22/2013   2-Feb   2013  19
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/23/2013   2-Feb   2013  17
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/24/2013   2-Feb   2013  39
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/25/2013   2-Feb   2013  23
>
> Code I used in R:
>
> library(dplyr) # provides filter() and select()
> data <- read.csv("D:/R/Data_sale_quantity.csv", stringsAsFactors = FALSE)
> factors <- unique(data$ItemNo)
> df.allitems <- data.frame()
> for (i in 1:length(factors))
> {
>   data1 <- filter(data, ItemNo == factors[[i]])
>   data2 <- select(data1, DC_City, Itemdescription, ItemNo, date, Year, SalesQuantity)
>   data2$date <- as.Date(data2$date, format = "%m/%d/%Y") # %Y: four-digit year
>   data3 <- data2[order(data2$date), ]
>   df.allitems <- rbind(data3, df.allitems) # Append by row bind
> }
>
> write.csv(df.allitems, "E:/all_items.csv")
>
> You can see the code clearly at:
> http://apache-spark-user-list.1001560.n3.nabble.com/Convert-R-code-into-SparkR-code-for-spark-1-4-version-tp23489.html
>
> I have written some SparkR code:
>
> data1 <- read.csv("D:/Data_sale_quantity_mini.csv") # read in R
> df_1 <- createDataFrame(sqlContext, data1) # converts R data.frame to Spark DF
> distinctDF <- distinct(df_1) # removed duplicates
>
> # for select I used:
> df_2 <- select(distinctDF, "DC_City", "Itemdescription", "ItemNo", "date", "Year", "SalesQuantity") # select action
>
> I don't know how to:
> 1) create an empty SparkR DF
> 2) use a for loop in SparkR
> 3) change the date format
> 4) find the length() of a Spark DF
> 5) use rbind in SparkR
>
> Can you help me write the above code in SparkR?
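
A note on the R loop in the question: it filters one item at a time, sorts that item's rows by date, and binds the pieces together, so most of its effect can be had from a single sort on (ItemNo, date). The sketch below shows this in base R on a small hypothetical in-memory sample (the `sales` data frame and its four rows are made up for illustration, mirroring the data set's columns). It is a sketch of the idea, not the reporter's code, and the item groups come out in ascending ItemNo order rather than in the reverse first-appearance order that the loop's `rbind` produces. The dates are parsed with `%Y` because they carry four-digit years.

```r
# Hypothetical sample mirroring a few rows and columns of the data set.
sales <- data.frame(
  DC_City = "Hyderabad",
  ItemNo = c(100005011, 100005010, 100005010, 100005011),
  Itemdescription = "more. Value Chana Dal 1 Kg.",
  date = c("2/12/2013", "12/21/2012", "9/16/2012", "2/1/2013"),
  Year = c(2013, 2012, 2012, 2013),
  SalesQuantity = c(3, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Parse the m/d/yyyy strings; %Y (not %y) matches a four-digit year.
sales$date <- as.Date(sales$date, format = "%m/%d/%Y")

# One sort on (ItemNo, date) stands in for the per-item filter/sort/rbind loop.
sorted <- sales[order(sales$ItemNo, sales$date), ]
rownames(sorted) <- NULL

print(sorted)
```

Writing the job as one sort matters for a SparkR port, because it removes the need for an empty data frame, a loop, and a row bind. SparkR's `arrange` looks like the closest counterpart to `order` here, but treat that as an assumption to check against the SparkR API documentation for the version in use.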