> On Mon, Mar 2, 2020 at 6:30 PM Sree Vaddi wrote:
>
>> +1
>>
>> Sent from Yahoo Mail on Android
>>
>> On Mon, Mar 2, 2020 at 5:12 PM, Wei-Chiu Chuang <weic...@apache.org> wrote:
>> Hi,
>>
>> Following the discussion to end branch-2.8, I want to start a discussion
>> around what's next with branch-2.9. I am hesitant to use the words "end of
>> life," but consider these facts:
>>
>> * 2.9.0 was released Dec 17, 2017.
>> * 2.9.2, the last 2.9.x release, went out Nov 19, 2018, which is more than
>>   15 months ago.
>> * No one seems to be interested in being the release manager for 2.9.3.
>> * Most if not all of the active Hadoop contributors are using Hadoop 2.10
>>   or Hadoop 3.x.
>> * We as a community do not have the cycles to manage multiple release
>>   lines, especially since Hadoop 3.3.0 is coming out soon.
>>
>> It is perhaps time to gradually reduce our footprint in Hadoop 2.x and
>> encourage people to upgrade to Hadoop 3.x.
>>
>> Thoughts?
--
John Zhuge
happens by trial-and-error, and then needing to figure out how to get the
thread-context-classloader set at the appropriate place and time.
John Lilley
-Original Message-
From: Sean Busbey
Sent: Tuesday, February 5, 2019 10:42 AM
To: John Lilley
Cc: user@hadoop.apache.org
Subject: Re
y? If not, is there a list of such classes?
Thanks,
John Lilley
and MapReduce do
this somehow, but we haven't been able to dig out the specifics from the code.
John Lilley
Nevermind. I found my stupid mistake. I didn’t reset a variable…this fact had
escaped me for the past two days.
From: "Avery, John"
Date: Wednesday, December 27, 2017 at 4:20 PM
To: "user@hadoop.apache.org"
Subject: Help me understand hadoop caching behavior
I’m writing a program using the C API for Hadoop. I have a 4-node cluster.
(Cluster was setup according to
https://www.tutorialspoint.com/hadoop/hadoop_multi_node_cluster.htm) Of the 4
nodes, one is the namenode and a datanode, the others are datanodes (with one
being a secondary namenode).
Hi Folks,
The Apache Gora team are pleased to announce the immediate availability of
Apache Gora 0.7.
The Apache Gora open source framework provides an in-memory data model and
persistence for big data. Gora supports persisting to column stores, key
value stores, document stores and RDBMSs, and an
not good design, coupling your input readers directly
with output writers. Instead, put the writers in separate threads and push
byte arrays to be written to them via a queue.
John Lilley
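A minimal sketch of the queue-based decoupling described above, assuming one
dedicated writer thread per output stream (class and sentinel names are
illustrative, not from the original thread):

import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Readers call write(); a single writer thread drains the queue to the stream.
public class QueuedWriter implements Runnable {
    private static final byte[] POISON = new byte[0]; // end-of-input sentinel
    private final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(1024);
    private final OutputStream out;

    public QueuedWriter(OutputStream out) { this.out = out; }

    public void write(byte[] chunk) throws InterruptedException {
        queue.put(chunk); // blocks if the writer falls behind, throttling readers
    }

    public void finish() throws InterruptedException {
        queue.put(POISON); // signal end of input
    }

    @Override
    public void run() {
        try {
            byte[] chunk;
            while ((chunk = queue.take()) != POISON) {
                out.write(chunk); // only this thread ever touches the stream
            }
            out.close();
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}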
From: Dmitry Goldenberg [mailto:dgoldenberg...@gmail.com]
Sent: Wednesday, May 25, 2016 9:12 PM
To: user@hadoop.
That's it! Thanks.
John Lilley
From: Chris Nauroth [mailto:cnaur...@hortonworks.com]
Sent: Tuesday, May 24, 2016 10:24 AM
To: John Lilley ; 'user@hadoop.apache.org'
Subject: Re: Filing a JIRA
Something is definitely odd about the UI there. From your second link, can yo
to here:
https://issues.apache.org/jira/browse/HADOOP/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel
But still, creating a bug using the "Bug -> Create Detailed" menu files it
against Atlas.
John Lilley
From: Chris Nauroth [mailto:cnaur...@hortonworks.com]
Sent: Monday, M
I am having trouble filing a bug report. I was trying this:
https://issues.apache.org/jira/servicedesk/customer/portal/5/create/27
But this doesn't seem right and refuses my request. Can someone point me to
the right place?
Thanks
John L
2048MB. We are using the default Capacity
Scheduler.
Is this a configuration error on our part or has the Resource Manager somehow
returned the wrong size container allocation? Should we simply reject small
containers and wait for the RM to find a larger one for us?
John Lilley
ice.datalever.com_45454
Only the first worker node has the higher file limit. The rest have lower
limits.
I have verified this on two separate clusters now. The same discrepancies are
observed by looking at /proc//limits for the datanode processes on each
worker node.
This is looking like
root soft nofile 65536
root hard nofile 65536
But none of this seems to affect the RM/NM limits.
Thanks
john
to understand why this process diverges from reloginFromKeytab(),
which works just fine.
John Lilley
From: Zheng, Kai [mailto:kai.zh...@intel.com]
Sent: Monday, August 17, 2015 5:40 PM
To: user@hadoop.apache.org
Subject: RE: UserGroupInformation and login with password
Hi John,
Login from keytab is mostly e
It seems like this should
be something simple. We do need to support password, because many of our
customers do not allow keytabs.
Thanks
John Lilley
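For what it's worth, a password login can be done with plain JAAS (standard
JDK API, outside of UGI); a sketch, assuming a jaas.conf entry named "pwKrb5"
that uses com.sun.security.auth.module.Krb5LoginModule (the entry name is an
assumption):

import javax.security.auth.Subject;
import javax.security.auth.callback.*;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

public final class PasswordKerberosLogin {
    // Obtains a TGT for the principal using its password; the resulting
    // Subject can then be handed to Hadoop security code.
    public static Subject login(final String principal, final char[] password)
            throws LoginException {
        CallbackHandler handler = callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(principal);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(password);
                } else {
                    throw new UnsupportedCallbackException(cb);
                }
            }
        };
        LoginContext lc = new LoginContext("pwKrb5", handler);
        lc.login();
        return lc.getSubject();
    }
}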
Follow-up, this is indeed a YARN bug and I've filed a JIRA, which has garnered
a lot of attention and a patch.
john
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Friday, April 17, 2015 1:01 PM
To: 'user@hadoop.apache.org'
Subject: Error in YARN localization with Active
files not matching
exactly. Unfortunately we made so many attempts that I can’t now recall
exactly what we did to bring it all into line.
john
From: Alexander Alten-Lorenz [mailto:wget.n...@gmail.com]
Sent: Wednesday, March 25, 2015 3:28 AM
To: user@hadoop.apache.org
Subject: Re: Trusted-realm
^.*@OFFICE\.DATALEVER\.COM$)s/^(.*)@OFFICE\.DATALEVER\.COM$/office\\$1/g
Thanks
John Lilley
at Edge AD controller trusts an "enterprise" AD controller. Trying
to authenticate using the password equivalent of
UserGroupInformation.loginUserFromKeytab() with a user in the "enterprise"
realm fails, while a user in the "edge" realm succeeds.
Thanks
John
Please disregard. Issue resolved.
-John
From: John Beaulaurier -X (jbeaulau - ADVANCED NETWORK INFORMATION INC at Cisco)
Sent: Wednesday, November 26, 2014 9:34 AM
To: user@hadoop.apache.org
Subject: SSH passwordless & Hadoop startup/shutdown scripts
Hello,
I had originally configured our
,hostname:
Name or service not known
Can someone suggest where I should be looking for configuration issues?
Thank you
-John
after reading the first line of the next record.
I would be very thankful for any help with implementing such a RecordReader.
Thanks in advance, John.
request properly, but *fails to somehow
resolve the rack at each of nodeUpdates* (never mind the resource limit of
virtualCores: I am using DefaultResourceCalculator, which only looks at the
memory)
I would appreciate any advice or suggestions.
Best regards,
John
Rack
We are seeing a warning deep in HDFS code, I was wondering if anyone knows of
this or if a JIRA has been filed or fixed? Searching on the warning text
didn't turn up anything. It is not harmful, AFAIK.
John
[Dynamic-linking native method java.net.NetworkInterface.init ... JNI]
WARNI
n properly. Any C++ thread calling AttachCurrentThread() must then
fetch the system class loader and set it to be the thread's context class
loader:
java.lang.Thread.currentThread().setContextClassLoader(java.lang.ClassLoader.getSystemClassLoader());
john
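A tiny helper on the Java side wraps that one call; a sketch (the class name
is illustrative), to be invoked from C++ right after AttachCurrentThread():

// Call once from native code after AttachCurrentThread(), before any Hadoop calls.
public final class JniThreadInit {
    public static void initContextClassLoader() {
        Thread.currentThread()
              .setContextClassLoader(ClassLoader.getSystemClassLoader());
    }
}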
From: John Lilley [mailto:john.li
Srikanth,
The cluster is idle while balancing, and it seems to move about 2MB/minute.
There is no discernable CPU load.
john
From: Srikanth upputuri [mailto:srikanth.upput...@huawei.com]
Sent: Thursday, September 04, 2014 12:57 AM
To: user@hadoop.apache.org
Subject: RE: How can I increase the
I have also found that neither
dfsadmin -setBalancerBandwidth
nor
dfs.datanode.balance.bandwidthPerSec
has any notable effect on the apparent balancer rate. This is on Hadoop 2.2.0.
john
From: cho ju il [mailto:tjst...@kgrid.co.kr]
Sent: Wednesday, September 03, 2014 12:55 AM
To: user
Can you run the load from an "edge node" that is not a DataNode?
john
John Lilley
Chief Architect, RedPoint Global Inc.
1515 Walnut Street | Suite 300 | Boulder, CO 80302
T: +1 303 541 1516 | M: +1 720 938 5761 | F: +1 781-705-2077
Skype: jlilley.redpoint | john.lil...@re
I think I found it:
yarn.nodemanager.delete.debug-delay-sec
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Tuesday, September 02, 2014 2:02 PM
To: 'user@hadoop.apache.org'
Subject: YARN userapp cache lifetime: can't find core dump
We have a YARN task that is core-dumpi
Shahab,
Thanks, but I think that is just for log aggregation.
I want to retain the entire localized directory structure for a YARN task,
including any files written to that place, after the task has exited.
John
From: Shahab Yunus [mailto:shahab.yu...@gmail.com]
Sent: Tuesday, September 02
below here is empty
/data2/hadoop/yarn/local/usercache/jlilley/appcache
I seem to recall there is a YARN setting to control the time these files are
kept around after application exit, but I can't figure out what it is.
Thanks,
john
Call addDelegationTokens() to extract delegated Credentials for HDFS
and keep them around.
Once this has been done, it appears tha tall is well. We can use those
Credentials in the YARN application master launch context.
john
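A sketch of that sequence against the Hadoop 2.x APIs (the "yarn" renewer
string is an assumption; typically the RM principal is used):

import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

public final class TokenHelper {
    // Collects HDFS delegation tokens and attaches them to the AM launch context.
    public static void attachHdfsTokens(Configuration conf,
                                        ContainerLaunchContext amContainer)
            throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Credentials creds = new Credentials();
        fs.addDelegationTokens("yarn", creds); // renewer principal is an assumption
        DataOutputBuffer dob = new DataOutputBuffer();
        creds.writeTokenStorageToStream(dob);
        amContainer.setTokens(ByteBuffer.wrap(dob.getData(), 0, dob.getLength()));
    }
}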
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Sunday, August 24, 2014 11:
Following up on this, I was able to extract a winutils.exe and hadoop.dll from
a Hadoop install for Windows, and set up HADOOP_HOME and PATH to find them. It
makes no difference to security, apparently.
John
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Saturday, August 23, 2014 2
these errors occur,
messages come from Hadoop like those below. Is it possible that this is
leading to our security failures? (I posted previously about that problem but
got no response). What does winutils.exe have to do with security, if anything?
Thanks
john
The relevant portions of the log
(Client.java:654)
... 24 more
Oddly, the multi-thread access pattern works in pure Java, it just fails when
performed from C++ via JNI. We are very careful to maintain global JNI
references etc... the JNI interface works flawlessly in all other cases. Any
ideas?
Thanks
John
The only way we've found to write an application master is by example. However,
the distributed-shell example in 2.2 was not very good; it is much improved in
2.4. I would start with that and edit, rather than create from scratch.
john
-Original Message-
From: Никитин Конст
Unsubscribe
On Jul 16, 2014 5:00 PM, "Xiaohua Chen" wrote:
> Hi Experts,
>
> I am new to Hadoop. I would like to get some help from you:
>
> Our current HDFS java client works fine with hadoop server which has
> NO Kerberos security enabled. We use HDFS lib e.g.
> org.apache.hadoop.fs.*.
>
> No
I'm trying to silence the output of an hdfs load script (snippets below), but
continue to get output. This causes issues when the multiple load scripts are
put into the background.
Am I missing something? Are there switches I could use? Any info is
appreciated. John
hadoop fs -mkdir /logs/firsttier/2014/D
Thanks. Apologies, I should have gone there first.
john
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Tuesday, July 08, 2014 11:04 AM
To: common-u...@hadoop.apache.org
Subject: Re: HBase metadata
I have forwarded the original post to user@hbase
FYI
On Tue, Jul 8, 2014 at 10:01 AM, Martin
Sorry to be rude, but what does everyone actually use now? We are an ISV and
need to support the most common access pattern.
john
From: Martin, Nick [mailto:nimar...@pssd.com]
Sent: Tuesday, July 08, 2014 10:53 AM
To: user@hadoop.apache.org
Subject: RE: HBase metadata
Have you looked @ Lingual
Those look intriguing. But what do people actually use today? Is it all
application-specific coding? Hive?
John
From: Mirko Kämpf [mailto:mirko.kae...@gmail.com]
Sent: Tuesday, July 08, 2014 10:12 AM
To: user@hadoop.apache.org
Subject: Re: HBase metadata
Hi John,
I suggest the project: http
question about metadata standards.
What do users mostly do to use HBase for row-oriented access? Is it always
going through Hive?
Thanks
john
Cheers,
john
From: Arun Murthy [mailto:a...@hortonworks.com]
Sent: Wednesday, June 25, 2014 11:50 PM
To: user@hadoop.apache.org
Subject: Re: persisent services in Hadoop
John,
We are excited to see ISVs like you get value from YARN, and appreciate the
patience you've already shown in the pa
describes how to add a job to this jar?
-John
have an AM that creates a set of YARN tasks and just waits
until YARN gives a task on each node, and restart any failed tasks, but it
doesn't really fit the AM/container structure very well. I've also read about
Slider, which looks interesting. Other ideas?
--john
require
direct connections to each DataNode? Does such an Edge Node proxy all of those
connections automatically, or does our software need to be made aware of this
convention somehow?
Thanks,
John
From: Rishi Yadav [mailto:ri...@infoobjects.com]
Sent: Saturday, June 07, 2014 8:20 AM
To: user
. Add the following property to hdfs-site.xml
dfs.web.ugi
webgroup
Is a restart of dfs, or mapred, or both necessary after adding the property?
Thanks
-John
to disk contention.
john
From: Natarajan, Prabakaran 1. (NSN - IN/Bangalore)
[mailto:prabakaran.1.natara...@nsn.com]
Sent: Thursday, June 12, 2014 12:00 AM
To: user@hadoop.apache.org
Subject: Hadoop SAN Storage reuse
Hi
I know SAN storage is not recommended for Hadoop. But we don't
nformation more easily, like from
a web API (where at least we'd only need one address and port)?
john
files to indicate the
key split.
But this kind of begs the question “why”? MapReduce has built-in support for
data partitioning on the fly in the “mappers” and you don’t really need to do
anything. Is that too slow for your needs?
john
From: Mirko Kämpf [mailto:mirko.kae...@gmail.com]
Sent
Hi, can anyone help me with this?
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Sunday, April 20, 2014 3:40 PM
To: user@hadoop.apache.org
Subject: HDFS and YARN security and interface impacts
We have an application that interfaces directly to HDFS and YARN (no
MapReduce). It does
ll be MapReduce.
For a "native" YARN/HDFS application, what changes if any must be made to the
API calls to support Kerberos or other authentication? Does it just happen
automatically at the OS level using the authenticated user ID of the process?
If there's a good reference I'd appreciate it.
john
Also, "Source Compatibility" also means ONLY a recompile is needed.
No code changes should be needed.
On Mon, Apr 14, 2014 at 10:37 AM, John Meagher wrote:
> Source Compatibility = you need to recompile and use the new version
> as part of the compilation
>
> Binary Compat
Source Compatibility = you need to recompile and use the new version
as part of the compilation
Binary Compatibility = you can take something compiled against the old
version and run it on the new version
On Mon, Apr 14, 2014 at 9:19 AM, Radhe Radhe
wrote:
> Hello People,
>
> As per the Apache s
t.connect.timeout", "7000");
before calling FileSystem.get() but it doesn't seem to matter.
What is the prescribed technique for lowering connection timeout to HDFS?
Thanks
john
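For reference, a sketch of the settings that appear to govern this (the key
names are from the Hadoop 2.x defaults and worth verifying against your
version; the values here are arbitrary examples):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class QuickFailExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Per-attempt connect timeout in ms (default 20000)...
        conf.set("ipc.client.connect.timeout", "5000");
        // ...and how many times a timed-out connect is retried (default 45).
        conf.set("ipc.client.connect.max.retries.on.timeouts", "2");
        FileSystem fs = FileSystem.get(conf); // fails fast if the NN is unreachable
        System.out.println(fs.getUri());
    }
}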
This discussion may also be relevant to your question:
http://stackoverflow.com/questions/21005643/container-is-running-beyond-memory-limits
Do you actually need to specify that -Xmx6000m for java heap or could it be one
of the other issues discussed?
John
From: John Lilley [mailto:john.lil
Could you have a pmem-vs-vmem issue as in:
http://stackoverflow.com/questions/8017500/specifying-memory-limits-with-hadoop
john
From: praveenesh kumar [mailto:praveen...@gmail.com]
Sent: Tuesday, March 25, 2014 7:38 AM
To: user@hadoop.apache.org
Subject: Re: Hadoop Takes 6GB Memory to run one
Does "netstat -an | grep LISTEN" show these ports being listened on?
Can you stat hdfs from the command line e.g.:
hdfs dfsadmin -report
hdfs fsck /
hdfs dfs -ls /
Also, check out /var/log/hadoop or /var/log/hdfs for more details.
john
From: Mahmood Naderan [mailto:nt_mahm...@yahoo
Wangda Tan,
Thanks for your reply! We did actually figure out where the problem was coming
from, but this is a very helpful technique to know.
John
From: Wangda Tan [mailto:wheele...@gmail.com]
Sent: Wednesday, March 26, 2014 6:35 PM
To: user@hadoop.apache.org
Subject: Re: Getting error
the total
command-line-argument + environment space.
Cheers,
John
From: Azuryy [mailto:azury...@gmail.com]
Sent: Wednesday, March 26, 2014 5:13 PM
To: user@hadoop.apache.org
Subject: Re: Getting error message from AM container launch
You used 'nice' in your app?
Sent from my iPhone5
On further examination they appear to be 369 characters long. I've read about
similar issues showing when the environment exceeds 132KB, but we aren't
putting anything significant in the environment.
John
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Wednesday, March 26,
We do have a fairly long container command-line. Not huge, around 200
characters.
John
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Wednesday, March 26, 2014 4:38 PM
To: user@hadoop.apache.org
Subject: Getting error message from AM container launch
Running a non-MapReduce YARN application, one of the containers launched by the
AM is failing with an error message I've never seen. Any ideas? I'm not sure
who exactly is running "nice" or why its argument list would be too long.
Thanks
john
Container for appattempt_13957
The balancer is not what handles adding extra replicas in the case of
a node failure, but it looks like the balancer bandwidth setting is
the way to throttle. See:
http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201301.mbox/%3c50f870c1.5010...@getjar.com%3E
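Besides the dfsadmin command, the Hadoop 2.x client exposes the same throttle
programmatically; a sketch (10 MB/s is an arbitrary value):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SetBalancerBandwidth {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Runtime equivalent of `hdfs dfsadmin -setBalancerBandwidth`;
        // the setting lasts until the datanodes restart.
        ((DistributedFileSystem) fs).setBalancerBandwidth(10L * 1024 * 1024);
    }
}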
On Wed, Mar 26, 2014 at 10:51
Never mind... we figured out its DNS entry was going missing.
john
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Thursday, March 13, 2014 2:52 PM
To: user@hadoop.apache.org
Subject: ResourceManager shutting down
We have this erratic behavior where every so often the RM will shutdown with an
UnknownHostException. The odd thing is, the host it complains about have been
in use for days at that point without problem. Any ideas?
Thanks,
John
2014-03-13 14:38:14,746 INFO rmapp.RMAppImpl
Stanley Shi,
On Fri, Mar 7, 2014 at 1:46 AM, John Lilley
mailto:john.lil...@redpoint.net>> wrote:
How would I go about fetching configuration values (e.g. yarn-site.xml) from
the cluster via the API from an application not running on a cluster node?
Thanks
John
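One approach, if I remember right: Hadoop daemons with a web UI serve their
effective configuration over HTTP at /conf; a sketch (host and port are
assumptions for your cluster):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class FetchClusterConf {
    public static void main(String[] args) throws Exception {
        // The RM web UI port (8088 by default) serves the merged configuration.
        URL conf = new URL("http://rm.example.com:8088/conf");
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(conf.openStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line); // XML name/value pairs
            }
        }
    }
}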
I've used it regularly and it works great. It has come up on the lists
occasionally, but not that often. A more recent version is available at
https://github.com/edwardcapriolo/filecrush
On Thu, Feb 27, 2014 at 12:26 PM, Devin Suiter RDX wrote:
> Hi,
>
> Has anyone used Hadoop Filecrush?
>
>
Ah... found the answer. I had to manually leave safe mode to delete the
corrupt files.
john
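For the record, leaving safe mode can also be done through the client API; a
sketch against the Hadoop 2.x classes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants;

public class LeaveSafeMode {
    public static void main(String[] args) throws Exception {
        // Programmatic equivalent of `hdfs dfsadmin -safemode leave`.
        DistributedFileSystem dfs =
            (DistributedFileSystem) FileSystem.get(new Configuration());
        dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_LEAVE);
    }
}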
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Tuesday, March 04, 2014 9:33 AM
To: user@hadoop.apache.org
Subject: RE: Need help: fsck FAILs, refuses to clean up corrupt fs
More information from the NameNode log. I don't understand... it is saying
that I cannot delete the corrupted file until the NameNode leaves safe mode,
but it won't leave safe mode until the file system is no longer corrupt. How
do I get there from here?
Thanks
john
2014-03-04 06
On Mon, Mar 3, 2014 at 11:49 PM, John Pauley
mailto:john.pau...@threattrack.com>> wrote:
This is cross posted to avro-user list
(http://mail-archives.apache.org/mod_mbox/avro-user/201402.mbox/%3ccf3612f6.94d2%25john.pau.
I have a file system with some missing/corrupt blocks. However, running hdfs
fsck -delete also fails with errors. How do I get around this?
Thanks
John
[hdfs@metallica yarn]$ hdfs fsck -delete
/rpdm/tmp/ProjectTemp_461_40/TempFolder_4/data00012_00.dld
Connecting to namenode via http
OK, restarting all services now fsck shows under-replication. Was it the
NameNode restart?
John
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Tuesday, March 04, 2014 5:47 AM
To: user@hadoop.apache.org
Subject: decommissioning a node
Our cluster has a node that reboots randomly. So
that this node
is really gone, and it should start replicating the missing blocks?
Thanks
John
    job.setNumReduceTasks(0);  // map-only job
    AvroJob.setOutputKeySchema(job, schema);
    AvroMultipleOutputs.addNamedOutput(job, "avro",
        AvroKeyOutputFormat.class, schema);
    FileInputFormat.addInputPath(job, new Path("/tmp/avrotest/input"));
    FileOutputFormat.setOutputPath(job, new Path("/tmp/avrotest/output"));
    return (job.waitForCompletion(true) ? 0 : 1);
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new AvroContainerFileDriver(), args);
    System.exit(exitCode);
  }
}
Thanks,
John Pauley
Sr. Software Engineer
ThreatTrack Security
rce with checksum d23ee1d271c6ac5bd27de664146be2
This command was run using /usr/lib/hadoop/hadoop-common-2.2.0.2.0.6.0-76.jar
Thanks
John
Actually it does show in Ambari, but the only time I've seen it is when adding
a new host it shows in the "other registered hosts" list.
john
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Saturday, March 01, 2014 12:32 PM
To: user@hadoop.apache.org
Subject: how to rem
: 0 (0 B)
DFS Used: 0 (0 B)
Non DFS Used: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used%: 100.00%
DFS Remaining%: 0.00%
Last contact: Fri Feb 28 18:20:38 MST 2014
Ambari doesn't show it at all in the hosts. How do I remove it so I can re-add
it without conflict?
john
Yes I have, and I'm talking to them now about getting a sample file. They may
be nice and give me a large file. I was also hoping to find "real" data if
possible.
Thanks,
john
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Saturday, March 01, 2014 8:43 AM
To: common-u...@ha
I would like to explore Call Data Record (CDR aka Call Detail Record) analysis,
and to that end I'm looking for a large (GB+) CDR file or a program to
synthesize a somewhat-realistic sample file. Does anyone know where to find
such a thing?
Thanks
John
one full round of retries of the underlying
ipc. In each ClientRMProxy retry, the max number of underlying ipc retry is
controlled by "ipc.client.connect.max.retries".
Did you try setting both ?
Jian
On Wed, Feb 12, 2014 at 8:36 AM, John Lilley
mailto:john.lil.
", "2");
Also does not help. Is there a retry parameter that can be set?
Thanks
John
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Monday, February 10, 2014 12:12 PM
To: user@hadoop.apache.org
Subject: RE: very long timeout on failed RM connect
I tried:
conf.set("
I tried:
conf.set("yarn.resourcemanager.connect.max-wait.ms", "1");
conf.set("yarn.resourcemanager.connect.retry-interval.ms", "1000");
But it has no apparent effect. Still hangs for a very long time.
john
From: Jian He [mailto:j...@hortonworks.com]
strategy exists, in order to prevent a
highly-loaded cluster from failing jobs. But it is not appropriate for an
interactive application.
Thanks
John
Thanks. Experimentally, I have found that changing the buffers sizes has no
effect, so that makes sense.
John
From: Arpit Agarwal [mailto:aagar...@hortonworks.com]
Sent: Tuesday, January 28, 2014 12:35 AM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes
Looks like
Thanks! I would have never found that.
john
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Monday, January 27, 2014 4:57 PM
To: common-u...@hadoop.apache.org
Subject: Re: HDFS read stats
FSDataInputStream has this javadoc:
/** Utility that wraps a {@link FSInputStream} in a {@link
, multi-process access to the same set of files.
3) Replication=1 having an influence.
Any ideas? I am not seeing any errors in the datanode logs.
I will run some other tests with replication=3 to see what happens.
John
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Monday, Januar
Ummm... so if I've called FileSystem.open() with an hdfs:// path, and it
returns an FSDataInputStream, how do I get from there to the DFSInputStream
that you say has the interface I want?
Thanks
John
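One way that should work on the 2.x client is a cast to HdfsDataInputStream;
a sketch (the path is a placeholder, and getReadStatistics() is worth
verifying against your client version):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSInputStream;
import org.apache.hadoop.hdfs.client.HdfsDataInputStream;

public class ReadStats {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        try (FSDataInputStream in = fs.open(new Path("/some/file"))) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) { /* consume */ }
            if (in instanceof HdfsDataInputStream) {
                DFSInputStream.ReadStatistics stats =
                    ((HdfsDataInputStream) in).getReadStatistics();
                System.out.println("total bytes: " + stats.getTotalBytesRead());
                System.out.println("local bytes: " + stats.getTotalLocalBytesRead());
            }
        }
    }
}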
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Sunday, January 26, 2014 6:16 PM
To: com
What exception would I expect to get if this limit was exceeded?
john
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Monday, January 27, 2014 8:12 AM
To:
Subject: Re: HDFS open file limit
Hi John,
There is a concurrent connections limit on the DNs that's set to a default of
4k max par
for any
errors?
On Jan 27, 2014 8:37 PM, "John Lilley"
mailto:john.lil...@redpoint.net>> wrote:
I am getting this perplexing error. Our YARN application launches tasks that
attempt to simultaneously open a large number of files for merge. There seems
to be a load threshold in te
However, I would still like to know what limit is being hit, and how to best
predict that limit on various cluster configurations.
Thanks,
john
Ted,
Thanks for the link! It says 2.1.0-beta fix, and I can find the FileSystem$Statistics
class in 2.2.0 but it only seems to talk about read/write ops and bytes, not
the local-vs-remote bytes. What am I missing?
John
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Sunday, January 26, 2014 10:26 AM
I have an application that wants to open a large set of files in HDFS
simultaneously. Are there hard or practical limits to what can be opened at
once by a single process? By the entire cluster in aggregate?
Thanks
John
Is there a way to monitor the proportion of HDFS read data that is satisfied by
local nodes vs going across the network?
Thanks
John
'd need to know more about your application.
john
From: rab ra [mailto:rab...@gmail.com]
Sent: Saturday, January 25, 2014 7:29 AM
To: user@hadoop.apache.org
Subject: RE: HDFS data transfer is faster than SCP based transfer?
The input files are provided as argument to a binary being executed b
There are no short-circuit writes, only reads, AFAIK.
Is it necessary to transfer from HDFS to local disk? Can you read from HDFS
directly using the FileSystem interface?
john
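A sketch of reading straight from HDFS rather than staging to local disk
first (the path is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DirectHdfsRead {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        try (FSDataInputStream in = fs.open(new Path("/data/input.bin"))) {
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                // process buf[0..n) in place; no local copy needed
            }
        }
    }
}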
From: Shekhar Sharma [mailto:shekhar2...@gmail.com]
Sent: Saturday, January 25, 2014 3:44 AM
To: user@hadoop.apache.org