> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote: > > giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java, > > line 47 > > <https://reviews.apache.org/r/22234/diff/1/?file=603360#file603360line47> > > > > it is not clear that unset can ever happen - all vertex IDs and edges > > are translated (with target being -1 here) > > > > I assume you need to add 1, so that if ID is not in the map is treated > > the same as if embedTargetInfo was never called (i.e. user creating > > vertices with specific IDs later) > > Pavan Kumar Athivarapu wrote: > sorry, I mean not-set > > Igor Kabiljo wrote: > yeah I understood that - but it is not clear that things can ever be > not-set - all input is going to be transformed (as oppose to just not being > in the mapping, which will return -1, which can be encoded and read as -1 > without having +1) > > Pavan Kumar Athivarapu wrote: > but what about stuff created later - that is not in the mapping, so it > would have prefix bits 0, now 0 for them does not mean all such vertices > belong in worker 0 > and such created vertices are not translated, translation is done only in > input phase
I meant just add more explanation in the comments, leave the code as is. > On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote: > > giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java, > > lines 39-47 > > <https://reviews.apache.org/r/22234/diff/1/?file=603359#file603359line39> > > > > as is, you don't need to override these functions? > > > > Also, it might be better to add: > > byte getEmbededInfo(long id), > > and have implementation of getPartition. > > > > (you could also implement above two methods, through two abstract > > primitive methods: > > long embedTargetInfo(long id); > > long removeTargetInfo(long id); > > Pavan Kumar Athivarapu wrote: > If u see subclasses of AbstractLongByteOps - > DefaultEmbeddedLongByteOps - this implements both embedTargetInfo & > removeTargetInfo, as it needs them > but DefaultLongByteOps - need not, because it computes partition by > looking up the MappingStore everytime it is called > > sure byte getEmbededInfo(long id) was in the initial implementation I > had, just merged it into getPartition, since that is the only place which > uses it > > so if u want it to return long, that becomes very restrictive, what about > generic vertexIds, which are not longs, the interface MappingStoreOps needs > to account for all those as well right? > > > Igor Kabiljo wrote: > Well, I would then put those unsupported implementations inside of > DefaultLongByteOps, instead of in AbstractLongByteOps. > > Also, seems like it might be better to add MappingStoreOps function > boolean hasEmbedding, instead of having flags giraph.embedInfoVertexIds and > giraph.edgeTranslationClass. (better not to have redudnant flags) > I.e. if hasEmbedding of active MappingStoreOps returns true, embedding > should be used throughout. > > > Pavan Kumar Athivarapu wrote: > makes total sense, this mappingstoreops was a last minute re-org, so did > not modify the original structure reg, embedInfo flag. > I made all the changes, any other comments? I'll prepare a diff & put it > up addressing all comments at once. Cool - everything else looks good. - Igor ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22234/#review44768 ----------------------------------------------------------- On June 4, 2014, 3:11 p.m., Pavan Kumar Athivarapu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/22234/ > ----------------------------------------------------------- > > (Updated June 4, 2014, 3:11 p.m.) > > > Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja > Kabiljo. > > > Repository: giraph-git > > > Description > ------- > > There are many changes here: > > A new input format - MappingInputFormat - and related dependencies has been > defined > New hive input format classes to read Mapping table + some examples which can > be used for running hellopagerank with modifications in test plan have been > defined > Changes to main giraph classes to read the mapping & use it for getting > partition info - in worker, master & partition sections > New mapping section defined to define & declare MappingStore format & some > sample implementations > The code can take 2 paths based on what the user wants > > Embed info into vertexId by implementing proper contracts > (TranslateEdge.interface + 2 methods in MappingStore) > then use an EmbeddedGraphPartitioner to directly read worker info off of > vertex ids [translate once & user freely for the rest of app] > > Always read from the MappingStore (this has more overhead probably because of > cache misses in processor) [never translate but pay cost of map lookup each > time partition / worker info is needed] > > > Diffs > ----- > > giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb > > giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java > bda967d > giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 > giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java > 6b36418 > > giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java > 95e029d > giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java > PRE-CREATION > giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java > PRE-CREATION > giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java > PRE-CREATION > giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java > PRE-CREATION > giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java > PRE-CREATION > giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java > PRE-CREATION > giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java > PRE-CREATION > giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java > 90dc9f3 > giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java > 15dbe07 > > giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java > 4200d79 > > giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java > 7cc5651 > > giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java > 1eeece7 > > giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java > 8ab692f > > giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java > 2989598 > > giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java > 15b0756 > > giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java > 600d7a3 > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java > aff7084 > > giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java > 828eac4 > > giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java > PRE-CREATION > giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java > PRE-CREATION > > giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java > e3e04d6 > > giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java > 4e19cd2 > giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java > 603910b > > giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java > c7ad63b > giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java > 2388673 > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java > PRE-CREATION > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java > PRE-CREATION > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java > PRE-CREATION > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java > PRE-CREATION > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java > PRE-CREATION > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java > PRE-CREATION > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java > PRE-CREATION > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java > PRE-CREATION > > giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/22234/diff/ > > > Testing > ------- > > ran pagerank jobs multiple times > mvn clean verify > > > Thanks, > > Pavan Kumar Athivarapu > >