i am observing that on a particular system (spark) my code breaks in that
avro does not return the specific record i expected but instead returns
generic records.
i suspect this is some class loading issue on the distributed system
(something about how the classpath is constructed for the spark
then the generic
representation is used. So, yes, this sounds like a classpath
problem.
On Mon, Oct 21, 2013 at 8:41 AM, Koert Kuipers ko...@tresata.com wrote:
i am observing that on a particular system (spark) my code breaks in that
avro does not return the specific record i expected
i had this too at some point. i just added paranamer to distributed cache
(or classpath on hadoop) and it went away
On Thu, Dec 13, 2012 at 2:21 PM, Terry Healy the...@bnl.gov wrote:
paran
try this first before going down the self
patching route.
Regards
Jacob
--
From: kkrugler_li...@transpac.com
Subject: Re: version of avro
Date: Fri, 19 Oct 2012 13:16:24 -0700
To: user@avro.apache.org
On Oct 19, 2012, at 1:03pm, Koert Kuipers wrote:
i
we are on a fairly old avro (1.5.4) so not sure my observations apply to
newer versions. i noticed that when i read from avro files in hadoop it
does not expect the reader's schema (fully qualified) name to be equal to
the writer's schema (fully qualified) name. this allows me to read from
files
i noticed avro version 1.5.4 is included with some versions/distros of
hadoop and hive... is there a reason why 1.5.4 is included specifically and
not newer ones? are there some incompatibilities to be aware of? i would
like to use a newer version
thanks! koert
how do i tell (generic) avro to use strings for values instead of its own
utf8 class?
i saw a way of doing it by modifying the schemas (adding a property). i
also saw mention of a way to do it if you use maven (which i don't).
is there a generic way to do this? like a system property perhaps?
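For reference, the schema-property approach mentioned in this question can be sketched as follows. The property name "avro.java.string" with value "String" is the one the Java implementation defines for this purpose; the helper name with_java_strings and the example record are hypothetical, and whether a given Avro release honors the property for generic reads should be checked against that release. The sketch only rewrites a parsed schema (plain dicts and lists), it does not call Avro itself.

```python
import json

def with_java_strings(schema):
    """Return a copy of a parsed Avro schema in which every "string" type
    carries the "avro.java.string": "String" property (hypothetical helper)."""
    if schema == "string":
        return {"type": "string", "avro.java.string": "String"}
    if isinstance(schema, list):  # a union: rewrite each branch
        return [with_java_strings(branch) for branch in schema]
    if isinstance(schema, dict):
        out = dict(schema)  # shallow copy so the input is left untouched
        kind = schema.get("type")
        if kind == "string":
            out["avro.java.string"] = "String"
        elif kind == "record":
            out["fields"] = [
                dict(field, type=with_java_strings(field["type"]))
                for field in schema["fields"]
            ]
        elif kind == "array":
            out["items"] = with_java_strings(schema["items"])
        elif kind == "map":
            out["values"] = with_java_strings(schema["values"])
        elif isinstance(kind, (dict, list)):  # nested type definition
            out["type"] = with_java_strings(kind)
        return out
    return schema  # other primitives and named references stay as they are

# Made-up example schema, for illustration only.
example_schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "tags", "type": {"type": "array", "items": "string"}},
        {"name": "age", "type": "int"},
    ],
}

rewritten = with_java_strings(example_schema)
rewritten_json = json.dumps(rewritten)  # the text form a job or reader would be given
```

This does not answer the "system property" part of the question; it only avoids editing schema files by hand.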
thanks doug
On Fri, Feb 3, 2012 at 3:58 PM, Doug Cutting cutt...@apache.org wrote:
On 02/02/2012 08:03 PM, Koert Kuipers wrote:
i have many avro files with similar data (same meaning, same type, etc.)
but different names for the fields.
can i create a reader schema that for each field
ok i will do thanks
On Fri, Feb 3, 2012 at 7:26 PM, Doug Cutting cutt...@apache.org wrote:
On 02/03/2012 01:57 PM, Koert Kuipers wrote:
I could create a copy myself using the Field constructor, however that
way i lose the aliases and props. In avro 1.5.4 there is no way to get
i have many avro files with similar data (same meaning, same type, etc.)
but different names for the fields.
can i create a reader schema that for each field that i am interested in
maps it to all the different possible fields in the files by using aliases,
and then run map-reduce over the files
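A minimal sketch of the alias idea in this question: in Avro schema resolution, a reader field is matched to a writer field that has the same name or a name listed in the reader field's "aliases". The schema below, its field names and aliases, and the project function are all hypothetical; the function imitates the name matching on plain dicts and is not Avro's own resolver (it does not handle defaults, type promotion, or missing-field errors).

```python
# Hypothetical reader schema: names and aliases are made up for illustration.
READER_SCHEMA = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "user_id", "type": "string", "aliases": ["uid", "userId"]},
        {"name": "amount", "type": "double", "aliases": ["amt", "value"]},
    ],
}

def project(writer_record, reader_schema):
    """Map a decoded writer record (a dict) onto the reader's field names.

    For each reader field, the writer's value is taken from the field with
    the same name, or else from the first alias present in the record.
    Writer fields the reader does not mention are dropped."""
    out = {}
    for field in reader_schema["fields"]:
        for candidate in [field["name"]] + field.get("aliases", []):
            if candidate in writer_record:
                out[field["name"]] = writer_record[candidate]
                break
    return out

# Two files that name the same data differently end up with one shape.
from_file_a = project({"uid": "a1", "amt": 2.5, "extra": 1}, READER_SCHEMA)
from_file_b = project({"userId": "b7", "value": 1.0}, READER_SCHEMA)
```

With a real reader the same effect would come from passing a schema like READER_SCHEMA as the reader schema, so the map-reduce code only ever sees the reader's field names.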
we are working on a very sparse table with say 500 columns where we do
batch uploads that typically only contain a subset of the columns (say
100), and we run multiple map-reduce queries on subsets of the columns
(typically less than 50 columns go into a single map-reduce job).
my question is the
Is there a way to override the avro generic representation, or perhaps an
easy way to create my own?
For example, for FIXED i would like Byte[] instead of ByteBuffer, for
STRING i would prefer String over CharArray, for arrays i would like to
have a List instead of a Collection, etc.
Right now i
I am reading from avro container files in hadoop. I know the container
files have a (writer's) schema stored in them. My reader specifies its
schema using avro.input.schema job parameter. This way any schema changes
are gracefully handled with both schemas present.
However, i don't always need
If i use Avro in hadoop (and read my data from Avro container files), will
i automatically get a very fast comparison for sorting in Hadoop (similar
to what WritableComparator provides)? Are there benchmarks on sorting with
Avro vs Writables?
Best, Koert
14 matches