Michael Armbrust created SPARK-1678:
---------------------------------------

             Summary: Compression loses repeated values.
                 Key: SPARK-1678
                 URL: https://issues.apache.org/jira/browse/SPARK-1678
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Michael Armbrust
            Assignee: Cheng Lian
            Priority: Blocker
             Fix For: 1.0.0


Here's a test case:

{code}
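  test("all the same strings") {
    // Register a table of 1000 rows that all hold the same string value.
    sparkContext.parallelize(1 to 1000).map(_ => StringData("test")).registerAsTable("test1000")
    // Before caching, the query returns all 1000 rows.
    assert(sql("SELECT * FROM test1000").count() === 1000)
    // Cache the table; the next query reads the cached copy.
    cacheTable("test1000")
    // The row count should be unchanged after caching.
    assert(sql("SELECT * FROM test1000").count() === 1000)
  }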
{code}

The first assert passes; the second one, which runs after cacheTable, fails.
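
To make the expected invariant concrete, here is a rough sketch of a round trip that must preserve the row count when every value is identical. This is not Spark's compression code: run-length encoding is only an assumed stand-in for whichever scheme is at fault, and RunLengthSketch, encode and decode are hypothetical names used for illustration.

{code}
// Illustrative only: a hypothetical stand-in, not Spark's columnar compression.
object RunLengthSketch {
  // Collapse consecutive equal values into (value, runLength) pairs.
  def encode[A](values: Seq[A]): List[(A, Int)] =
    values.foldLeft(List.empty[(A, Int)]) {
      case ((v, n) :: rest, x) if v == x => (v, n + 1) :: rest // extend the current run
      case (acc, x)                      => (x, 1) :: acc      // start a new run
    }.reverse

  // Expand every run back to its original length.
  def decode[A](runs: Seq[(A, Int)]): Seq[A] =
    runs.flatMap { case (v, n) => Seq.fill(n)(v) }
}

// Same shape as the failing case: 1000 copies of one string.
val data = Seq.fill(1000)("test")
val runs = RunLengthSketch.encode(data)
assert(RunLengthSketch.decode(runs).size == data.size)
{code}

Whatever the actual scheme, decoding has to emit one value per original row rather than one per distinct value or per run; a decoder that gets this wrong would pass on varied data and fail on a column like the one above.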



--
This message was sent by Atlassian JIRA
(v6.2#6252)
