Re: [U2] STARTUP file issue with UV11.1 PE version (Linux)

2012-10-02 Thread Hona, David
Rocket's installation instructions are quite good and do indeed mention 
the need to use cpio on UNIX servers. See the Quick Installation and 
Step-by-Step Instructions sections (in NEWINSTALL.PDF)...

However, the instructions from Rocket could be improved with a minor 
revision, as the installation guide assumes you're using a CD-ROM or tape drive 
to get the installation software onto your system.

It is not made clear that after you download the software archive from the 
Internet, you can upload it (as a binary file) to your UNIX host. Once 
there, unzip it (preferably as root) and then perform the cpio step.


-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of doug chanco
Sent: Tuesday, 2 October 2012 3:43 AM
To: 'U2 Users List'
Subject: Re: [U2] STARTUP file issue with UV11.1 PE version (Linux)

No sir, I did not know that. Why would they cpio it anyway? Not that it 
matters; I was just curious. Anyway, thanks for the info.

Dougc

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Brian Leach
Sent: Monday, October 01, 2012 1:19 PM
To: 'U2 Users List'
Subject: Re: [U2] STARTUP file issue with UV11.1 PE version (Linux)

Doug

Have you remembered that STARTUP is a cpio archive?

# cpio -ivcdumB uv.load < STARTUP
./uv.load


-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of doug chanco
Sent: 01 October 2012 17:37
To: U2 Users List
Subject: [U2] STARTUP file issue with UV11.1 PE version (Linux)

I recently downloaded UV 11 and when I went to run STARTUP I got a weird error. 
Upon looking at the STARTUP script I noticed it had a bunch of binary and other 
junk at the beginning of the file. I removed all the extra stuff, saved the 
file, and it ran just fine.

 

Has anyone else seen this? I re-downloaded the zip and still had this issue. 
It was easy enough to resolve, but I thought I would mention it.

 

Dougc

 

 

 





___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Symeon Breen
Oracle and SQL Server both use map-reduce internally when doing collations
and totals. However, they work differently to U2 in that they have one big
process that runs queries from the clients. This process can then cache,
multithread and map-reduce. U2 is architected differently in that the client
processes (uv or udt processes) actually do the work and the central udt
processes are fairly slim. These client processes are single-threaded. Any
multi-threading/multi-processing is part of the application rather than
inherent in the database. 

One option is to make U2 a Hadoop-supported data store; you could then
map-reduce across multiple instances using whatever Hadoop-supporting toolset
you wanted.

However, map-reduce and Hadoop are pretty horrible things. Even Google have
moved away from it with Caffeine etc.
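Symeon's point, that parallelism has to live in the application with each worker its own single-threaded process, can be sketched in miniature. A Python illustration (not a U2 API; `process_slice` is a hypothetical stand-in for whatever per-record work a phantom would do):

```python
# Illustration only (Python, not U2 BASIC): the application, not the
# database, fans work out to independent worker processes ("phantoms")
# and combines their partial results in the master process.
from multiprocessing import Pool

def process_slice(keys):
    # Hypothetical stand-in for the per-record work a phantom would do.
    return sum(len(str(k)) for k in keys)

def parallel_total(all_keys, workers=4):
    # Deal the keys out round-robin, one slice per worker, then reduce.
    slices = [all_keys[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        partials = pool.map(process_slice, slices)
    return sum(partials)

if __name__ == "__main__":
    print(parallel_total(list(range(1000))))  # prints 2890
```

The master does only the split and the final sum; everything else happens in the workers, which is roughly the relationship between a controlling process and its phantoms.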



-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wjhonson
Sent: 01 October 2012 21:05
To: u2-users@listserver.u2ug.org
Subject: [U2] [u2] Parallel processing in Universe


What's the largest dataset in the Universe user world?
In terms of number of records.

I'm wondering if we have any potential for utilities that map-reduce.
I suppose you would spawn phantoms but how do they communicate back to the
master node?
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] STARTUP file issue with UV11.1 PE version (Linux)

2012-10-02 Thread Wols Lists
On 02/10/12 08:23, Hona, David wrote:
 The installation instructions of Rocket is quite good and does indeed mention 
 the need to use cpio on UNIX servers. See Quick Installation and 
 Step-by-step Instructions (of NEWINSTALL.PDF)...
 
 However, the instructions from Rocket could be improved - with a minor 
 revision, as the Installation guide assumes you're using a CD-ROM or tape 
 drive to get the installation software on your system.
 
 It is not clear that after you download the software archive file from the 
 Internet, that you can upload it (as a binary file) to your UNIX host. Once 
 there, un-zip (preferably as root) and then you must perform the cpio 
 after that step.

Does it also mention that the default cpio options have changed? Okay,
if you're installing on a non-supported Linux, why would Rocket worry;
but if you do get problems, I think it's the -B option. Something to do
with block size, anyway. The default meaning has reversed, so the script
isn't portable across different Linuxen. All you have to do, though, is
add or remove the changed option.

Cheers,
Wol
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wols Lists
On 01/10/12 22:47, Robert Houben wrote:
 Create an index on a dict pointing at the first character of the key, and 
 have each phantom take two digits. (0-1, 2-3, 4-5, 6-7, 8-9)
 
Actually, this is a very BAD way of chopping up a file into five even
chunks.

I'm not sure of the stats, but on any file with sequential keys, the
first phantom will get the majority of the records, the second will get the
majority of what's left, and so on.

A lot of people make the mistake of thinking this is a good technique.
I'm not even sure it works well with random numbers...

Cheers,
Wol
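Wol's claim is easy to sanity-check. A small Python illustration (assuming purely sequential integer keys and the digit pairing Robert suggested):

```python
# Quick check of the skew: with sequential keys, bucketing on the FIRST
# digit (pairs 0-1, 2-3, 4-5, 6-7, 8-9) is wildly uneven.
from collections import Counter

def first_digit_buckets(keys):
    # Map each key to a bucket: first digit 0-1 -> bucket 0, 2-3 -> 1, etc.
    counts = Counter(int(str(k)[0]) // 2 for k in keys)
    return [counts[b] for b in range(5)]

print(first_digit_buckets(range(1, 2001)))
# prints [1111, 223, 222, 222, 222]: the first phantom gets over half
```

With keys 1..2000, the 0-1 phantom gets 1111 records while the other four share the rest, exactly the lopsided split described above.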
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wols Lists
On 02/10/12 03:49, Ross Ferris wrote:
 If the file were big enough, and already had part files, then I believe that 
 you could have a phantom process each of the individual parts. Failing that, 
 get an SSD  relatively cheap, and will give your processing a reasonable 
 kick along!!
 
Just be careful with an SSD. If you have a power fail in the middle of
your process, that sounds like exactly the scenario that will trash it. As
in: totally dead, no recovery possible.

SSDs are great, but a power fail during a write can take out the
controller. One dead, irrecoverable disk. And if you're hammering the
I/O you are VERY vulnerable.

Cheers,
Wol
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread George Gallen
What about a striped array of SSDs with a backup battery to flush the write
buffer on power fail? No more dangerous (IMO) than an array of hard drives.
But given the limited write cycles of an SSD, that could be more of a danger,
unless you're using larger drives and not a lot of data, so the drive has
lots of area to fail over to when it reaches its write maximum.

-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wols Lists
Sent: Tuesday, October 02, 2012 4:20 AM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] [u2] Parallel processing in Universe

On 02/10/12 03:49, Ross Ferris wrote:
 If the file were big enough, and already had part files, then I believe that 
 you could have a phantom process each of the individual parts. Failing that, 
 get an SSD  relatively cheap, and will give your processing a reasonable 
 kick along!!
 
Just be careful with an SSD. If you have a power-fail in the middle of
your process this sounds just like the scenario that will trash it. As
in, totally dead no recovery possible.

SSDs are great, but a power fail during write can take out the
controller. One dead, irrecoverable disk. And if you're hammering the
i/o you are VERY vulnerable.

Cheers,
Wol
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe (Unclassified)

2012-10-02 Thread Doug Averch
Only outside of U2, using UniObjects, can you achieve any type of parallel
activity. We have, through UniObjects, got 80 processes working from a single
Eclipse session through the use of threads in Java.

UniObjects creates an individual uvapi_slave or udapi_slave for each of these
processes, but the system (in this case the udapi_server or uvapi_server)
cannot handle as many threads as we would like.  We never ran out of memory
on our 8GB Windows 2008 R2 server, nor did the 120GB SSD fail to keep up
with the 80 ANALYZE.FILES or the 80 RESIZE commands we were issuing from
our XLr8Resizer product within Eclipse.

The only way we got this working was to set the retries to 1000 when
reopening the connections.  Although that number seems high, it helped and
got us from our previous best of 39 processes to 80 processes. When we have a
lot of time and cannot think of anything better to do, we will try for 500
processes.

Regards,
Doug
www.u2logic.com
Eclipse based tools for the U2 programmer




 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org [mailto:
 u2-users-boun...@listserver.u2ug.org] On Behalf Of HENDERSON MIKE, MR
 Sent: Tuesday, 2 October 2012 1:18 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe (Unclassified)

 I have often thought about this - mostly in an idle moment or as a
 displacement activity for something less amusing that I ought to be doing.
 ;-)


 First of all, Universe is already extremely parallel: there's a separate
 O/S thread for each TTY and for each phantom, and you can't get more
 parallel than that for interactive processing.

 So you want more parallelism for your batch processes.
 Different applications have different degrees of inherent parallelism.
 For example in utility billing systems there is frequently the concept of
 a group of premises - based on the old concept of a foot-borne meter reader
 with a 'book' of readings to get. Each 'book' can be processed
 independently of every other. In payroll, each employee's record can be
 processed independently. Other areas of commerce have different
 characteristics.

 I think that whatever unit of parallelism you settle for, you'd need three
 processes: a 'dispatcher' that selects records for processing and queues
 them into some structure for processing; a set of 'workers' that take
 queued work items, process them, mark them as processed and put the results
 in some common store; and a 'monitor' that looks for unprocessed records
 and indications of stuck processes, and collates the results for final
 output.
 I've seen a couple of versions of this, one for electricity billings and
 another for overnight batch-processing of report requests, both well over a
 decade ago, and neither still in use although their underlying packages are
 still being run.

 The major issue is that these days the whole entity in the general
 commercial world is far more likely to be I/O limited than CPU limited, and
 therefore introducing parallelism will be no help at all if the I/O system
 is already choked.
 Even if the system is currently CPU-limited, multi-threading may not
 produce much improvement without very careful design of the record locking
 philosophy - introducing parallelism will be no help if all the threads end
 up contending serially for one record lock or a small set of locks.


 If you want it to go faster, buy the CPU with the fastest clock you can
 get (not the one with the most cores), and put your database on SSD like
 Ross said.
 The Power7+ chips being announced any day now are rumoured to go to
 5GHz+, maybe even more if you have half the cores on the chip disabled.


 Regards


 Mike
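Mike's dispatcher/worker/monitor shape above is a standard work-queue pattern. A minimal sketch in Python (illustrative only, not U2 BASIC; threads and an in-memory Queue stand in for phantoms and the shared queue structure, and `rid * 2` is a hypothetical placeholder for real per-record work):

```python
# Dispatcher/worker/monitor in miniature: one process queues the selected
# records, several workers drain the queue, results land in a common store.
import queue
import threading

def run_batch(record_ids, n_workers=4):
    work = queue.Queue()           # dispatcher's queue of selected records
    results = {}                   # the workers' common result store
    guard = threading.Lock()

    for rid in record_ids:         # dispatcher: select and queue the work
        work.put(rid)

    def worker():
        while True:
            try:
                rid = work.get_nowait()
            except queue.Empty:    # nothing left: this worker retires
                return
            processed = rid * 2    # hypothetical stand-in for real work
            with guard:            # one lock per result write, not per record
                results[rid] = processed

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results                 # the monitor would collate these
```

Note the locking point Mike raises: here the workers only contend on the result store, briefly; if they all had to serialize on one record lock for the work itself, the parallelism would buy nothing.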

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Ross Ferris
 Sent: Tuesday, 2 October 2012 3:50 p.m.
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 If the file were big enough, and already had part files, then I believe
 that you could have a phantom process each of the individual parts.
 Failing that, get an SSD; relatively cheap, and it will give your
 processing a reasonable kick along!!

 Ross Ferris
 Stamina Software
 Visage  Better by Design!


 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton
 Sent: Tuesday, 2 October 2012 8:47 AM
 To: 'U2 Users List'
 Subject: Re: [U2] [u2] Parallel processing in Universe

 OK - I was trying to create a 'smoother use' of the disk and 'read ahead'
 -- this example the disk would be chattering from the heads moving all over
 the place. I was trying to find a way to make this process more 'orderly'
 -- is there one?

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Robert Houben
 Sent: Monday, October 01, 2012 4:48 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel 

Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wols Lists
On 02/10/12 15:28, George Gallen wrote:
 What about an striped array of SSD with a backup battery to flush the write 
 buffer on power fail.
 No more dangerous (IMO) than an array of hard drives - but given the limited 
 write times of an SSD
 That could be more of a danger, unless your using larger drives and not a lot 
 of data so the drive
 Has lot's of area to failover to when it reaches it's write maximum.

I guess a backup battery would save you. Basically, anything to prevent
power dying in the middle of a write. But a striped array would
probably simply mean several trashed drives instead of one. It's a
known, guaranteed, "this is what will kill a drive" scenario, and an
array would just mean more drives at risk.

The place I came across a major discussion about this (I knew of the
issue earlier) said that some combo of Windows, update, and a certain
laptop was notorious for writing off drives. The update would flood the
cache, then the laptop would suspend. Cue one dead drive and, if within
warranty, one no-quibble replacement.

Cheers,
Wol
 
 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org 
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wols Lists
 Sent: Tuesday, October 02, 2012 4:20 AM
 To: u2-users@listserver.u2ug.org
 Subject: Re: [U2] [u2] Parallel processing in Universe
 
 On 02/10/12 03:49, Ross Ferris wrote:
 If the file were big enough, and already had part files, then I believe that 
 you could have a phantom process each of the individual parts. Failing that, 
 get an SSD  relatively cheap, and will give your processing a reasonable 
 kick along!!

 Just be careful with an SSD. If you have a power-fail in the middle of
 your process this sounds just like the scenario that will trash it. As
 in, totally dead no recovery possible.
 
 SSDs are great, but a power fail during write can take out the
 controller. One dead, irrecoverable disk. And if you're hammering the
 i/o you are VERY vulnerable.
 
 Cheers,
 Wol
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users
 

___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wjhonson

Yes, the low numbers are used more often.
However, if you have sequential keys, just use the *last* two digits instead 
of the first two.
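A quick sanity check of the last-digit idea (Python illustration, assuming purely sequential integer keys):

```python
# Bucket sequential keys on their LAST digit (pairs 0-1, 2-3, ...):
# the last digit cycles evenly, so each phantom gets an equal share.
from collections import Counter

def last_digit_buckets(keys):
    counts = Counter(int(str(k)[-1]) // 2 for k in keys)
    return [counts[b] for b in range(5)]

print(last_digit_buckets(range(1, 2001)))  # prints [400, 400, 400, 400, 400]
```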



-Original Message-
From: Wols Lists antli...@youngman.org.uk
To: u2-users u2-users@listserver.u2ug.org
Sent: Tue, Oct 2, 2012 1:17 am
Subject: Re: [U2] [u2] Parallel processing in Universe


On 01/10/12 22:47, Robert Houben wrote:
 Create an index on a dict pointing at the first character of the key, and 
 have 
each phantom take two digits. (0-1, 2-3, 4-5, 6-7, 8-9)
 
Actually, this is a very BAD way of chopping up a file into five even
chunks.

I'm not sure of the stats, but on any file with sequential keys, the
first phantom will get the majority of the records, the second get the
majority of what's left, etc etc.

A lot of people make the mistake of thinking this is a good technique.
I'm not even sure it works well with random numbers...

Cheers,
Wol
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users

 
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread David Wolverton
In my example, I would grab whatever records were hashed into the 'group';
while it's not perfect since there is 'overflow', I was just trying to
think of a way to break a file into pieces that would otherwise process much
like a BASIC select: just grab the 'group' and go. I can see it's
probably not possible, but the topic got me thinking about 'what if'...
(And we're UniData, so I have to apply that filter to most everything I
read on the list anyway <G>)

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Taylor
Sent: Monday, October 01, 2012 6:10 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

Or, let's suppose you wanted to process repetitive segments of one very
large record using the same logic in a separate phantom process for each
segment, how large a record can be read and processed in Universe?

Dave

 So how would a user 'chop up' a file for parallel processing?  
 Ideally, if here was a Mod 10001 file (or whatever) it would seem like 
 it would be 'ideal' to assign 2000 groups to 5 phantoms -- but I don't 
 know how 'start a BASIC select at Group 2001 or 4001' ...

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George 
 Gallen
 Sent: Monday, October 01, 2012 3:29 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 OPENSEQ "/tmp/pipetest" TO F.PIPE ELSE STOP "NO PIPE"
 LOOP
    READSEQ LINE FROM F.PIPE ELSE CONTINUE
    PRINT LINE
 REPEAT
 STOP
 END

 Although, not sure if you might need to sleep a little between the 
 READSEQ's ELSE and CONTINUE; it might suck up CPU time when nothing is 
 writing to the file.

 Then you could set up a printer in UV that did a "cat - > /tmp/pipetest"

 Now your phantom just needs to print to that printer.

 George

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George 
 Gallen
 Sent: Monday, October 01, 2012 4:16 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 The only thing about a pipe is that once it's closed, I believe it has 
 to be re-opened by both ends again. So if point A opens one end, and 
 point B opens the other end, once either end closes, it closes for 
 both sides, and both sides would have to reopen again to use it.

 To eliminate this, you could have one end open a file and have the 
 other sides do a >> append to that file; just make sure you include 
 some kind of data header so the reading side knows which process just 
 wrote the data.

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of u2ug
 Sent: Monday, October 01, 2012 4:11 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 pipes


 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wjhonson
 Sent: Monday, October 01, 2012 4:05 PM
 To: u2-users@listserver.u2ug.org
 Subject: [U2] [u2] Parallel processing in Universe


 What's the largest dataset in the Universe user world?
 In terms of number of records.

 I'm wondering if we have any potential for utilities that map-reduce.
 I suppose you would spawn phantoms but how do they communicate back to 
 the master node?
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users


 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users

 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users



___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users

___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread George Gallen
What if you created a duplicate file, did a SELECT, and saved the list
(non-sorted)?

Each of the phantoms would do a getlist and loop through using readlist/readu,
and if a record were already locked, skip it until it reads an unlocked
record (and locks it). Delete the record when finished.
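George's skip-if-locked scheme can be sketched with non-blocking locks. A Python illustration (not U2 BASIC; `Lock.acquire(blocking=False)` plays the role of READU with a LOCKED clause, and holding the lock stands in for deleting the finished record):

```python
# Every "phantom" walks the same saved list; a record lock won means
# "process this one", a lock refused means "someone else has it, skip".
import threading
from collections import Counter

def run_phantoms(ids, n_phantoms=3):
    ids = list(ids)
    locks = {rid: threading.Lock() for rid in ids}   # one lock per record
    processed = Counter()
    guard = threading.Lock()

    def phantom():
        for rid in ids:                   # same GETLIST for every phantom
            if not locks[rid].acquire(blocking=False):
                continue                  # LOCKED: skip to the next id
            # Lock won: do the work, then mark it done ("delete" it).
            with guard:
                processed[rid] += 1

    threads = [threading.Thread(target=phantom) for _ in range(n_phantoms)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

Because a lock, once won, is never released, every record is processed exactly once no matter how many phantoms race over the list.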



-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton 
Sent: Tuesday, October 02, 2012 11:43 AM
To: 'U2 Users List'
Subject: Re: [U2] [u2] Parallel processing in Universe

In my example, I would grab 'whatever' records were hashed in the to 'group'
-- while it's not perfect since there are 'overflow' - was just trying to
think of a way to break a file into pieces that would otherwise process much
like a BASIC select - just grab the 'group' and go  I can see it's
probably not possible, but the topic got me thinking about 'what if'...
(And we're UniData - so I have to apply that filter to most everything I
read on the list anyway G)

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Taylor
Sent: Monday, October 01, 2012 6:10 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

Or, let's suppose you wanted to process repetitive segments of one very
large record using the same logic in a separate phantom process for each
segment, how large a record can be read and processed in Universe?

Dave

 So how would a user 'chop up' a file for parallel processing?  
 Ideally, if here was a Mod 10001 file (or whatever) it would seem like 
 it would be 'ideal' to assign 2000 groups to 5 phantoms -- but I don't 
 know how 'start a BASIC select at Group 2001 or 4001' ...

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George 
 Gallen
 Sent: Monday, October 01, 2012 3:29 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 0001: OPENSEQ /tmp/pipetest TO F.PIPE ELSE STOP NO PIPE
 0002: LOOP
 0003:READSEQ LINE FROM F.PIPE ELSE CONTINUE
 0004:PRINT LINE
 0005: REPEAT
 0006: STOP
 0007: END

 Although, not sure if you might need to sleep a litte between the 
 READSEQ's ELSE and CONTINUE
Might suck up cpu time when nothing is writing to the file.

 Then you could setup a printer in UV that did a  cat -  /tmp/pipetest

 Now your phantom just needs to print to that printer.

 George

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George 
 Gallen
 Sent: Monday, October 01, 2012 4:16 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 The only thing about a pipe is that once it's closed, I believe it has 
 to be re-opened by both Ends again. So if point a opens one end, and 
 point b opens the other end, once either end closes, It closes for 
 both sides, and both sides would have to reopen again to use.

 To eliminate this, you could have one end open a file, and have the 
 other sides do a  append To that file - just make sure you include 
 some kind of dataheader so the reading side knows which Process just wrote
the data.

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of u2ug
 Sent: Monday, October 01, 2012 4:11 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 pipes


 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wjhonson
 Sent: Monday, October 01, 2012 4:05 PM
 To: u2-users@listserver.u2ug.org
 Subject: [U2] [u2] Parallel processing in Universe


 What's the largest dataset in the Universe user world?
 In terms of number of records.

 I'm wondering if we have any potential for utilities that map-reduce.
 I suppose you would spawn phantoms but how do they communicate back to 
 the master node?
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users


 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users

 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users



___
U2-Users mailing list
U2-Users@listserver.u2ug.org

Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread David Wolverton
Ah, you would not even have to 'delete' as long as the 'locks' are held long
enough. Meaning: if you know you will have 20 phantoms, each phantom would
keep a list of 'keys locked', and once it hits 21 (or 40 if you want
insurance LOL) in the list, it would unlock the earliest lock. That way there
is no way any other phantom could process anything twice...

As each phantom runs, if it hits a locked record, it would move to the next
item in the list.

Great idea!

DW

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George Gallen
Sent: Tuesday, October 02, 2012 10:52 AM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

What if you created a duplicate file, did a SELECT and saved the list
(non-sorted).

Each of the phantoms would do a getlist and loop through using
readlist/readu  and if the record were already locked, skip it until it
reads An unlocked record  (and locks it). Delete the record when finished.



-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton 
Sent: Tuesday, October 02, 2012 11:43 AM
To: 'U2 Users List'
Subject: Re: [U2] [u2] Parallel processing in Universe

In my example, I would grab 'whatever' records were hashed in the to 'group'
-- while it's not perfect since there are 'overflow' - was just trying to
think of a way to break a file into pieces that would otherwise process much
like a BASIC select - just grab the 'group' and go  I can see it's
probably not possible, but the topic got me thinking about 'what if'...
(And we're UniData - so I have to apply that filter to most everything I
read on the list anyway G)

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Taylor
Sent: Monday, October 01, 2012 6:10 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

Or, let's suppose you wanted to process repetitive segments of one very
large record using the same logic in a separate phantom process for each
segment, how large a record can be read and processed in Universe?

Dave

 So how would a user 'chop up' a file for parallel processing?  
 Ideally, if here was a Mod 10001 file (or whatever) it would seem like 
 it would be 'ideal' to assign 2000 groups to 5 phantoms -- but I don't 
 know how 'start a BASIC select at Group 2001 or 4001' ...

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George 
 Gallen
 Sent: Monday, October 01, 2012 3:29 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 0001: OPENSEQ /tmp/pipetest TO F.PIPE ELSE STOP NO PIPE
 0002: LOOP
 0003:READSEQ LINE FROM F.PIPE ELSE CONTINUE
 0004:PRINT LINE
 0005: REPEAT
 0006: STOP
 0007: END

 Although, not sure if you might need to sleep a litte between the 
 READSEQ's ELSE and CONTINUE
Might suck up cpu time when nothing is writing to the file.

 Then you could setup a printer in UV that did a  cat -  /tmp/pipetest

 Now your phantom just needs to print to that printer.

 George

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George 
 Gallen
 Sent: Monday, October 01, 2012 4:16 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 The only thing about a pipe is that once it's closed, I believe it has 
 to be re-opened by both Ends again. So if point a opens one end, and 
 point b opens the other end, once either end closes, It closes for 
 both sides, and both sides would have to reopen again to use.

 To eliminate this, you could have one end open a file, and have the 
 other sides do a  append To that file - just make sure you include 
 some kind of dataheader so the reading side knows which Process just wrote
the data.

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of u2ug
 Sent: Monday, October 01, 2012 4:11 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 pipes


 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wjhonson
 Sent: Monday, October 01, 2012 4:05 PM
 To: u2-users@listserver.u2ug.org
 Subject: [U2] [u2] Parallel processing in Universe


 What's the largest dataset in the Universe user world?
 In terms of number of records.

 I'm wondering if we have any potential for utilities that map-reduce.
 I suppose you would spawn phantoms but how do they communicate back to 
 the master node?
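For what it's worth, the spawn-workers-and-collect-results pattern being asked about can be sketched in Python (a sketch under the assumption that phantoms map onto OS processes; the "records" and the length-summing map function are made up for illustration):

```python
from multiprocessing import Process, Queue

def worker(worker_id, records, out_q):
    # "Map" step: each spawned process summarises its own slice of
    # records and sends the partial result back to the master.
    out_q.put((worker_id, sum(len(r) for r in records)))

def map_reduce(records, n_workers=4):
    out_q = Queue()                      # the channel back to the master
    procs = []
    for i in range(n_workers):
        p = Process(target=worker, args=(i, records[i::n_workers], out_q))
        p.start()
        procs.append(p)
    partials = [out_q.get() for _ in procs]
    for p in procs:
        p.join()
    # "Reduce" step: combine the partial results in the master
    return sum(total for _, total in partials)
```

The queue plays the role the list discussion assigns to pipes: it is how the phantoms communicate back to the master node.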
 ___
 U2-Users mailing list
 U2-Users@listserver.u2ug.org
 http://listserver.u2ug.org/mailman/listinfo/u2-users



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Daniel McGrath
You've highlighted one problem here.

By having multiple processes accessing the disk in different locations, you 
destroy cache optimization and seek times. More phantoms = less performance. 
This assumes I/O is a bigger concern than CPU, which is generally the case.

More phantoms = more communication, which also adds another overhead that 
reduces performance.

By introducing more phantoms than CPU cores, you increase the amount of context 
switching, which once again hurts your cache usage as well as adding bigger 
overheads on the CPU again.

In short, except for very specific cases, increasing 'concurrency' through 
phantoms on a single machine is generally ill-advised, resulting in longer 
processing times, higher average system loads and worse yet, greater system 
complexity (and hence ways for things to break).

As mentioned earlier, more system-level architectural changes (such as multiple 
machines, or at least, file storage on different disks/spindles for each 
process) are required if you want to benefit from this sort of work.


-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton 
Sent: Monday, October 01, 2012 4:47 PM
To: 'U2 Users List'
Subject: Re: [U2] [u2] Parallel processing in Universe

OK - I was trying to create a 'smoother use' of the disk and 'read ahead' -- 
in this example the disk would be chattering from the heads moving all over the 
place. I was trying to find a way to make this process more 'orderly' -- is 
there one?

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Robert Houben
Sent: Monday, October 01, 2012 4:48 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

Create an index on a dict pointing at the first character of the key, and have 
each phantom take two digits. (0-1, 2-3, 4-5, 6-7, 8-9)
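The leading-digit split suggested above can be sketched in Python (illustrative only; it assumes purely numeric keys, and note that the buckets will be only as balanced as the distribution of leading digits):

```python
def partition(keys, n_phantoms=5):
    # Assign each key to a phantom by its leading digit: with 5 phantoms,
    # phantom 0 gets keys starting 0-1, phantom 1 gets 2-3, and so on.
    buckets = [[] for _ in range(n_phantoms)]
    for k in keys:
        buckets[int(str(k)[0]) * n_phantoms // 10].append(k)
    return buckets
```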


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread David Wolverton
Great point!!  I think we can agree that 'spinning media latency' is the
enemy, and that having phantoms increase the 'head dance' can make things worse,
not better!  

Many problems go away or become trivial as spinning media rides off into the
sunset.  I've advised customers that just moving 'code files' to a tiny SSD
would likely increase overall system performance on Windows boxes.  Just
waiting until the price of enterprise SSDs makes them a no-brainer...
Until then, even small SSDs will help!




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Daniel McGrath
Yes, SSD will definitely help. Just keep in mind, it doesn't remove all the 
negatives in regards to I/O, particularly with regard to caching.

Disk caching in a modern system is fairly complex, but at a high level it is 
done not only by the controller, but by the OS as well. So randomly flying 
around the disk still causes cache thrashing. :(



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wjhonson

The idea of the phantoms would be to read the file in order, not randomly -- just 
in order from five different starting points.
So you should still get the benefit of some caching.




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread George Gallen
If 5 phantoms were running and read in order, but from 5 different starting 
points, the records would essentially still be processed in a random order
if you were to lay out the record IDs as they get processed.

George



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wjhonson

The point of the caching concern is the read-ahead, and you will still get 
some benefit from it if your five phantoms are reading their *portion* of 
the file in order, which they should.





Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread George Gallen
OK, I see what you're saying... I'll buy that.



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread David Wolverton
Which was my question -- is there a way to 'jump to' a group, or do a 'BASIC
SELECT' with a 'starting/ending' group -- so that again, with modulo 10001, one
phantom does 'groups' 1-2000, the next phantom does 'groups' 2001-4000, etc.?
I can't see that it's really possible without jumping through hoops that
make it unattractive at best!  At least on UniData!

DW



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wjhonson

You may not need to know what *group* you are in per se, if you are willing to 
use the file stats record.
You can determine from the last stats how many records are in your file.

Then your master program just reads the keys until it gets to the 50,000th key 
(or whatever), and then spawns a phantom, telling it which key to start with 
and how many keys to process before it ends.

Or maybe you don't need the stat file if UniData has @SELECTED to tell you 
how many keys there are...
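The master-reads-keys-then-spawns plan above can be sketched in Python (a hypothetical helper; the key values and the 50,000 chunk size are just placeholders):

```python
def plan_chunks(keys, chunk=50000):
    # Master-side planning: walk the key list once and hand each phantom
    # a (starting key, number of keys to process) pair, as described above.
    jobs = []
    for start in range(0, len(keys), chunk):
        jobs.append((keys[start], min(chunk, len(keys) - start)))
    return jobs
```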




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Ross Ferris
Could also avoid the lock contention if each phantom had knowledge of the 
others, so phantom 1 could only process @IDs 1, 6, 11, etc., phantom 2 would do 
2, 7, 12, and so on.

Of course, if you are operating with a select list, that already implies that 
you have processed the file once, so your batch process is actually a 
're-read'; so in the absence of a suitable index, perhaps employing the 
Drumheller trick would be worth considering.
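The interleaved-ownership rule in the first paragraph can be sketched in Python (illustrative; it assumes each phantom knows its own number and the total phantom count, and that record ordinals are available):

```python
def owns(record_ordinal, phantom_id, n_phantoms=5):
    # Phantom 1 takes records 1, 6, 11, ...; phantom 2 takes 2, 7, 12, ...
    # No two phantoms ever claim the same record, so they never contend
    # for the same record lock.
    return record_ordinal % n_phantoms == phantom_id % n_phantoms
```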

Ross Ferris
Stamina Software
Visage  Better by Design!


-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wjhonson
Sent: Wednesday, 3 October 2012 3:42 AM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] [u2] Parallel processing in Universe


The point of the caching concern is related to the read ahead, and you will 
still get some benefit from this, if your five phantoms are reading their 
*portion* of the file in order, which they should.



-Original Message-
From: George Gallen ggal...@wyanokegroup.com
To: U2 Users List u2-users@listserver.u2ug.org
Sent: Tue, Oct 2, 2012 10:39 am
Subject: Re: [U2] [u2] Parallel processing in Universe


If 5 phantoms were running, and read in order but from 5 different starting 
points, the records would Essentially still be processed in a random order, if 
you were to layout the record ID's as they get Processed.

George

-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org]
On Behalf Of Wjhonson
Sent: Tuesday, October 02, 2012 1:35 PM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] [u2] Parallel processing in Universe


The idea of the phantoms would be to read the file in order, not randomly, just 
in order from five different starting points.
So you should still get the benefit of some caching.


___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users

 


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Ross Ferris
Depends on what you call a no-brainer -- to me, $4K for an 800GB Intel 910 
SSD seems reasonable for what you get (10x full-drive writes every day for 5 
years has the endurance angle covered IMHO; 400GB is $2K if your database will 
fit), and by today's standards it represents reasonable value. Not quite at 
the performance level of Fusion-io, but cheap enough to just about be 
affordable.

Ross Ferris
Stamina Software
Visage  Better by Design!

-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton 
Sent: Wednesday, 3 October 2012 3:19 AM
To: 'U2 Users List'
Subject: Re: [U2] [u2] Parallel processing in Universe

Great point!!  I think we can agree that 'spinning media latency' is the enemy 
and that having phantoms increase the 'head dance' can make things worse, not 
better!  

Many problems go away or become trivial as spinning media rides into the 
sunset.  I've advised customers that just moving 'code files' to a tiny SSD 
would likely increase overall system performance on Windows boxes.  Just 
waiting until the price of enterprise SSDs makes them a no-brainer...
Until then, even small SSDs will help!



-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Daniel McGrath
Sent: Tuesday, October 02, 2012 12:05 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

You've highlighted one problem here.

By having multiple processes access the disk in different locations, you 
destroy cache optimization and worsen seek times. More phantoms = less performance.
This assumes I/O is a bigger concern than CPU, which is generally the case.

More phantoms = more communication, which also adds another overhead that 
reduces performance.

By introducing more phantoms than CPU cores, you increase the amount of context 
switching, which once again hurts your cache usage as well as adding bigger 
overheads on the CPU.

In short, except for very specific cases, increasing 'concurrency' through 
phantoms on a single machine is generally ill-advised, resulting in longer 
processing times, higher average system loads and, worse yet, greater system 
complexity (and hence more ways for things to break).

As mentioned earlier, more system-level architectural changes (such as multiple 
machines or, at least, file storage on different disks/spindles for each 
process) are required if you want to benefit from this sort of work.


-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton
Sent: Monday, October 01, 2012 4:47 PM
To: 'U2 Users List'
Subject: Re: [U2] [u2] Parallel processing in Universe

OK - I was trying to create a 'smoother use' of the disk and 'read ahead' -- 
in this example the disk would be chattering from the heads moving all over the 
place. I was trying to find a way to make this process more 'orderly' -- is 
there one?

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Robert Houben
Sent: Monday, October 01, 2012 4:48 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

Create an index on a dict pointing at the first character of the key, and have 
each phantom take two digits. (0-1, 2-3, 4-5, 6-7, 8-9)
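The first-character split above can be sketched the same way. This is illustrative Python rather than the dictionary-item-plus-index approach described, and `bucket_for` is an invented name; it just shows the leading-digit-to-phantom mapping (0-1 to phantom 0, 2-3 to phantom 1, ... 8-9 to phantom 4) for purely numeric keys.

```python
def bucket_for(record_id, n_phantoms=5):
    """Map a numeric record ID to a 0-based phantom number by leading digit."""
    first = int(str(record_id)[0])       # first character of the key
    return first // (10 // n_phantoms)   # two digits per phantom when n=5

for key in ("0042", "23042", "57", "84", "99"):
    print(key, "-> phantom", bucket_for(key))
```

In UniVerse itself the equivalent would be an I-type dictionary item returning the first character of @ID, with a secondary index built over it so each phantom can SELECT its own bucket cheaply.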

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton
Sent: October-01-12 2:43 PM
To: 'U2 Users List'
Subject: Re: [U2] [u2] Parallel processing in Universe

So how would a user 'chop up' a file for parallel processing?  Ideally, if there 
was a Mod 10001 file (or whatever), it would seem 'ideal' to assign 2000 groups 
to each of 5 phantoms -- but I don't know how to 'start a BASIC select at Group 
2001 or 4001' ...

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George Gallen
Sent: Monday, October 01, 2012 3:29 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

OPENSEQ '/tmp/pipetest' TO F.PIPE ELSE STOP 'NO PIPE'
LOOP
   READSEQ LINE FROM F.PIPE ELSE CONTINUE
   PRINT LINE
REPEAT
STOP
END

Although, I'm not sure if you might need to sleep a little between the READSEQ's 
ELSE and CONTINUE -- it might suck up CPU time when nothing is writing to the file.

Then you could set up a printer in UV that did a  cat - > /tmp/pipetest

Now your phantom just needs to print to that printer.
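The same pipe pattern can be sketched outside UniVerse. Below is a minimal Python illustration (not UV BASIC): one thread plays the "printer" side writing into a named pipe, the main code plays the phantom-listener side reading it. The blocking read sidesteps the busy-wait concern about READSEQ's ELSE CONTINUE. The path and names are invented for the example.

```python
import os
import tempfile
import threading

# A throwaway named pipe standing in for /tmp/pipetest.
fifo = os.path.join(tempfile.mkdtemp(), "pipetest")
os.mkfifo(fifo)

def printer():
    # Stands in for the UV printer cat-ing into the pipe:
    # open() blocks until a reader appears, then the lines flow through.
    with open(fifo, "w") as w:
        w.write("line one\n")
        w.write("line two\n")

t = threading.Thread(target=printer)
t.start()

lines = []
with open(fifo) as r:        # blocks until the writer opens its end
    for line in r:           # EOF arrives when the writer closes the pipe
        lines.append(line.rstrip("\n"))
t.join()
print(lines)
```

A blocking read leaves the listener idle while nothing is being written, which is the behaviour the sleep-between-retries suggestion is trying to approximate.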

George

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George Gallen
Sent: Monday, October 01, 2012 4:16 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

The only thing about a pipe is that once it's closed, I believe it has to 

Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Robert Colquhoun
On Tue, Oct 2, 2012 at 5:58 PM, Symeon Breen syme...@gmail.com wrote:

 However map reduce and hadoop are pretty horrible things. Even Google have
 moved away from it with Caffiene etc.


Going OT a little, I think Google is replacing BigTable (which was part of
Caffeine in 2010) with Spanner now.  Here is a doc about it, released last
month:

http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/spanner-osdi2012.pdf

...amusingly, they call it a Multi-Version Database; can't wait till that
gets abbreviated.

- Robert