GitHub user lukovnikov opened a pull request:

    https://github.com/apache/spark/pull/4650

    RDF Loader added + documentation

    I have been testing it with DBpedia dumps, and it works well so far.
    Any help with custom partitioning and optimization is welcome.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lukovnikov/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4650.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4650
    
----
commit 10436d252ad4876d28c91c77036e3d993050438a
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T19:41:58Z

    fast forward from upstream

commit 595aed098fb423514b73263f96dfcaf1edbc72f5
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T21:41:00Z

    dictionary builder done

commit c2399023825e804476527f7e159b182a1b5c91c8
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T21:44:07Z

    [SPARK 5280]

commit f14e4835cf365fcbe5dd0979e61464b7cecb8774
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T22:50:06Z

    done dictionary version

commit 43cc53ab6d99a4a96a0764cc306f38fdce3a7e00
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T23:25:07Z

    [SPARK 5280] rdfloader using hashes as VertexIds

commit 2e1220d0938aee7d190439253e3b9bb1e73c77e8
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T00:04:48Z

    cleaned up + fixed style
    TODO: test + comment

commit 54e2c6eb24dade70753320a3ab2b3a64fef7a6d4
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T00:26:30Z

    made custom 64bit hash

commit b454560508c9d50c60e067d7e67405ca1e13c165
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T00:32:57Z

    proper

commit 45a9f57695e76c09c20fa99a1010168f63ef1da8
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T19:41:58Z

    fast forward from upstream

commit 6ee9a2b675d06675b5b591f16e8d52e63d2dc049
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T21:41:00Z

    dictionary builder done

commit 45c22160c52111066109f57a0d773aca211c2068
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T21:44:07Z

    [SPARK 5280]

commit fa5c0da9ea4f6ca662406b380432901022d6de55
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T22:50:06Z

    done dictionary version

commit c036f98476e96ac03124f758ed7f17c4a464cf86
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T23:25:07Z

    [SPARK 5280] rdfloader using hashes as VertexIds

commit 57553797f7404e686674b0bfb39d80bb24d6520c
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T00:04:48Z

    cleaned up + fixed style
    TODO: test + comment

commit e00123eae4a84108af2c84cf253b1f4fb1fb69f1
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T00:26:30Z

    made custom 64bit hash

commit 6af9a7ad6198174597ae7d86ec5c15fc8467a082
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T00:32:57Z

    proper

commit 1ee34c9474bcf4500edecb08a848d15f3549055d
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T03:31:05Z

    Merge branch 'master' of github.com:lukovnikov/spark into rdfloaderhash

commit 9000a4713d286d5078c16f62b5fadf480941bc82
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T03:31:18Z

    Merge branch 'rdfloaderhash' of github.com:lukovnikov/spark into rdfloaderhash

commit 70eb725a102ae711a59c6d45794d191c18778c4b
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T23:02:48Z

    RDF Loader with hash, tested on small RDF dumps (more tests in progress)

commit 4398d93712777442ba0f2e8920423fcdd7b67d1f
Author: Denis <lukovni...@users.noreply.github.com>
Date:   2015-02-04T23:27:01Z

    added documentation for RDFLoader

commit 273a1b30dee1630333e0f7e683378b6dbb13c3a5
Author: Denis <lukovni...@users.noreply.github.com>
Date:   2015-02-04T23:29:05Z

    small update to RDFLoader description

commit 202ccf86901c3d2435564e544f90d6a49cda66fb
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T23:31:10Z

    sdf

commit 2d990cec1d48f62f4f1d9f9cf8082308a4eaf9e4
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-03T19:41:58Z

    fast forward from upstream

commit 4a9b6222176749bee4a14e4b6d035b665c6ac7ea
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T23:43:31Z

    Merge branch 'master' of github.com:lukovnikov/spark

commit 062996c45d0443836c1b4b2bb714d8f459ea6980
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T23:43:52Z

    Merge branch 'rdfloaderhash'

commit 121bf14140573349424e7888da13ee2e8ea4f6f0
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T23:45:48Z

    [SPARK 5280]

commit 67ada514b98292ff647d8354545d37cc111499ba
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T23:47:21Z

    Merge branch 'rdfloaderhash' of github.com:lukovnikov/spark into rdfloaderhash

commit e5fcf758c0e4b54a38b2a01709681e11bbb6eae8
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T23:47:45Z

    Merge branch 'rdfloaderhash'

commit c5960af7b14d65b1d290c3af11d722075a54ad2d
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-04T23:54:37Z

    Merge remote-tracking branch 'upstream/master'

commit 91361f3f760dbc78467f8e2b87a1d77061aa59de
Author: lukovnikov <lukovnikov@denis>
Date:   2015-02-05T00:01:33Z

    undone unnecessary changes

----
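
Several commit messages above mention using hashes as VertexIds and a custom
64-bit hash. The pull request's code is not included in this message, so the
following is only a rough sketch of that general idea, not the author's
implementation. It assumes simple N-Triples input (one space-separated triple
per line) and uses FNV-1a as a stand-in 64-bit hash; the names `term_id`,
`parse_triple`, and `load_edges` are hypothetical.

```python
# Hypothetical sketch: map RDF terms to 64-bit IDs so they can serve as vertex IDs.
# FNV-1a is an assumed hash here, not necessarily the one used in the pull request.

FNV_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3        # standard 64-bit FNV prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def term_id(term: str) -> int:
    """Return a signed 64-bit FNV-1a hash of an RDF term."""
    h = FNV_OFFSET
    for byte in term.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    # Reinterpret as signed so the value fits a JVM Long
    # (GraphX defines VertexId as a Long).
    return h - (1 << 64) if h >= (1 << 63) else h


def parse_triple(line: str):
    """Split a simple N-Triples line into (subject, predicate, object)."""
    body = line.rstrip().rstrip(".").rstrip()  # drop the terminating " ."
    subject, predicate, obj = body.split(" ", 2)
    return subject, predicate, obj


def load_edges(lines):
    """Build (vertices, edges) from triples.

    vertices maps id -> term; edges is a list of (src_id, dst_id, predicate).
    """
    vertices, edges = {}, []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        s, p, o = parse_triple(line)
        sid, oid = term_id(s), term_id(o)
        vertices[sid] = s
        vertices[oid] = o
        edges.append((sid, oid, p))
    return vertices, edges
```

Hashing avoids building a global dictionary of terms, at the cost of a small
chance that two distinct terms collide on the same ID; a real loader would need
to decide how to detect or tolerate that.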


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
