Doesn't DESeqDataSetFromMatrix work only with a local R data.frame? It won't work with a Spark DataFrame. Try collect(countMat), and the same for the other inputs, to convert them into data.frames.
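A rough sketch of that conversion step, in case it helps. The helper name to_local_df is made up for illustration (it is not part of SparkR or DESeq2), and the small data.frames are stand-ins so the snippet runs without a cluster. The only SparkR call used is collect(), which returns a local data.frame.

```r
# Hypothetical helper (not part of SparkR or DESeq2): return a local R
# data.frame regardless of which kind of table it is given.
to_local_df <- function(x) {
  if (is.data.frame(x)) {
    return(x)  # already a local data.frame; nothing to do
  }
  # Assumption: anything else is a Spark DataFrame, which SparkR::collect()
  # pulls back to the driver as an ordinary data.frame.
  SparkR::collect(x)
}

# Stand-in data so the sketch runs without Spark; the real inputs would be
# the tables read from /cntMatrix.txt and /160208.txt.
countMat <- data.frame(s1 = c(10L, 0L), s2 = c(3L, 7L))
sampleDict <- data.frame(sample = c("s1", "s2"), group = c("a", "b"),
                         stringsAsFactors = FALSE)

# Same subsetting as in the COMMON CODE, applied to local copies.
countLocal <- to_local_df(countMat)[, sampleDict$sample]
colData <- to_local_df(sampleDict)[, "group", drop = FALSE]

# Convert the counts to a plain matrix before handing them to DESeq2.
countData <- as.matrix(countLocal)
```

With local objects in hand, the DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ group) call from the original script should at least be receiving the input types it expects. I have not run this against your data.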
_____________________________
From: roni <roni.epi...@gmail.com>
Sent: Thursday, February 18, 2016 4:55 PM
Subject: cannot coerce class "data.frame" to a DataFrame - with spark R
To: <user@spark.apache.org>

Hi,

I am trying to convert a bioinformatics R script to use SparkR. It uses an external Bioconductor package (DESeq2), so the only conversion I have really made is to change the way it reads the input file. When I call the DESeq2 function I get the error: cannot coerce class "data.frame" to a DataFrame.

I am listing my old R code and new SparkR code below, and the line giving the problem is in RED.

ORIGINAL R -

    library(plyr)
    library(dplyr)
    library(DESeq2)
    library(pheatmap)
    library(gplots)
    library(RColorBrewer)
    library(matrixStats)
    library(pheatmap)
    library(ggplot2)
    library(hexbin)
    library(corrplot)

    sampleDictFile <- "/160208.txt"
    sampleDict <- read.table(sampleDictFile)
    peaks <- read.table("/Atlas.txt")
    countMat <- read.table("/cntMatrix.txt", header = TRUE, sep = "\t")
    colnames(countMat) <- sampleDict$sample
    rownames(peaks) <- rownames(countMat) <- paste0(peaks$seqnames, ":", peaks$start, "-", peaks$end, " ", peaks$symbol)
    peaks$id <- rownames(peaks)

############
# SPARK R CODE

    peaks <- read.csv("/Atlas.txt", header = TRUE, sep = "\t")
    sampleDict <- read.csv("/160208.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
    countMat <- read.csv("/cntMatrix.txt", header = TRUE, sep = "\t")

-------------------------------------------------------------------
COMMON CODE for both -

    countMat <- countMat[, sampleDict$sample]
    colData <- sampleDict[, "group", drop = FALSE]
    design <- ~ group
    dds <- DESeqDataSetFromMatrix(countData = countMat, colData = colData, design = design)

This line gives the error -

    dds <- DESeqDataSetFromMatrix(countData = countMat, colData = (colData), design = design)

    Error in DataFrame(colData, row.names = rownames(colData)) :
      cannot coerce class "data.frame" to a DataFrame

I tried as.data.frame or using DataFrame to wrap the defs, but no luck. What can I do differently?

Thanks
Roni