from:"Pranay Kumar Srivastava"

When to use blk_end_request_* routines.

2013-03-14 Thread Pranay Kumar Srivastava

Hi,

I'm using SLES11-SP1, kernel 2.6.32.12-0.7.

I wrote a block driver for an in-memory disk. The driver seems to work fine;
however, sometimes when I try to unload the driver the machine freezes. The
module_exit code looks like the following:

/* We take no more requests! */
spin_lock_irqsave(&pks_disk->queue_lock, flags);
blk_stop_queue(pks_disk->pks_disk_rqq);
spin_unlock_irqrestore(&pks_disk->queue_lock, flags);

/* Remove disk */
del_gendisk(pks_disk->pks_disk_gd);

/* Free queue */
blk_cleanup_queue(pks_disk->pks_disk_rqq);

I searched around and found that, while processing requests from the request
queue, you have to use one of the blk_end_request* functions (which I did).
However, if I only use

a) blk_end_request: the module removal code freezes the machine.

b) blk_end_request_cur: the module removal code works fine.

There's one more:
c) blk_end_request_all: I didn't use this one because the one above worked :P.

I dug around in the source and found the function blk_update_request. Its
comment says "Actual device drivers should use blk_end_request instead",
which I did, and the driver module didn't like it when I tried to remove
it :(

Now I'm confused, as I have no clue when to use which function. Can anyone give
some examples, please?

--P.K.S
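
A rough way to picture the difference between the three calls, as far as I
remember the 2.6.32-era block/blk-core.c: blk_end_request(rq, error, nr_bytes)
completes nr_bytes and returns true while the request still has bytes pending,
blk_end_request_cur() completes only the current chunk (blk_rq_cur_bytes(rq)),
and blk_end_request_all() completes whatever is left. The sketch below is a
userspace toy model of just that byte accounting, not kernel code. struct
toy_request and every toy_* name are made up for illustration, and the model
says nothing about locking or about why the unload freezes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct request: byte accounting only, no real I/O. */
struct toy_request {
    unsigned int seg_bytes[4]; /* size of each segment (hypothetical layout) */
    size_t nr_segs;            /* number of valid entries in seg_bytes */
    size_t cur_seg;            /* index of the segment being processed */
    unsigned int cur_done;     /* bytes already completed in cur_seg */
};

/* Bytes left in the current segment (models blk_rq_cur_bytes()). */
static unsigned int toy_rq_cur_bytes(const struct toy_request *rq)
{
    if (rq->cur_seg >= rq->nr_segs)
        return 0;
    return rq->seg_bytes[rq->cur_seg] - rq->cur_done;
}

/* Bytes left in the whole request (models blk_rq_bytes()). */
static unsigned int toy_rq_bytes(const struct toy_request *rq)
{
    unsigned int total = toy_rq_cur_bytes(rq);
    for (size_t i = rq->cur_seg + 1; i < rq->nr_segs; i++)
        total += rq->seg_bytes[i];
    return total;
}

/* Complete nr_bytes; true means "still bytes pending", false means done. */
static bool toy_end_request(struct toy_request *rq, unsigned int nr_bytes)
{
    while (nr_bytes > 0 && rq->cur_seg < rq->nr_segs) {
        unsigned int left = rq->seg_bytes[rq->cur_seg] - rq->cur_done;
        unsigned int take = nr_bytes < left ? nr_bytes : left;

        rq->cur_done += take;
        nr_bytes -= take;
        if (rq->cur_done == rq->seg_bytes[rq->cur_seg]) {
            rq->cur_seg++;      /* segment finished, move to the next one */
            rq->cur_done = 0;
        }
    }
    return rq->cur_seg < rq->nr_segs;
}

/* Complete only the current chunk. */
static bool toy_end_request_cur(struct toy_request *rq)
{
    return toy_end_request(rq, toy_rq_cur_bytes(rq));
}

/* Complete everything that is left. */
static void toy_end_request_all(struct toy_request *rq)
{
    (void)toy_end_request(rq, toy_rq_bytes(rq));
}
```

In this model a loop of the form "handle the current chunk, call the _cur
variant, repeat until it returns false" walks the request one segment at a
time. One more thing that may be worth checking in a real driver: as far as I
can tell, the plain blk_end_request* helpers take the queue lock themselves,
while the __blk_end_request* variants expect the caller to already hold it
(which is the situation inside the request function).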


::DISCLAIMER::


The contents of this e-mail and any attachment(s) are confidential and intended 
for the named recipient(s) only.
E-mail transmission is not guaranteed to be secure or error-free as information 
could be intercepted, corrupted,
lost, destroyed, arrive late or incomplete, or may contain viruses in 
transmission. The e mail and its contents
(with or without referred errors) shall therefore not attach any liability on 
the originator or HCL or its affiliates.
Views or opinions, if any, presented in this email are solely those of the 
author and may not necessarily reflect the
views or opinions of HCL or its affiliates. Any form of reproduction, 
dissemination, copying, disclosure, modification,
distribution and / or publication of this message without the prior written 
consent of authorized representative of
HCL is strictly prohibited. If you have received this email in error please 
delete it and notify the sender immediately.
Before opening any email and/or attachments, please check them for viruses and 
other defects.




___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

RE:Query on skb buffer (Kumar amit mehta)

2013-03-06 Thread Pranay Kumar Srivastava

 -Original Message-
 From: kernelnewbies-boun...@kernelnewbies.org [mailto:kernelnewbies-
 boun...@kernelnewbies.org] On Behalf Of kernelnewbies-
 requ...@kernelnewbies.org
 Sent: Thursday, March 07, 2013 8:52 AM
 To: kernelnewbies@kernelnewbies.org
 Subject: Kernelnewbies Digest, Vol 28, Issue 12

 Send Kernelnewbies mailing list submissions to
   kernelnewbies@kernelnewbies.org

 To subscribe or unsubscribe via the World Wide Web, visit
   http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
 or, via email, send a message with subject or body 'help' to
   kernelnewbies-requ...@kernelnewbies.org

 You can reach the person managing the list at
   kernelnewbies-ow...@kernelnewbies.org

 When replying, please edit your Subject line so it is more specific than Re:
 Contents of Kernelnewbies digest...

 Today's Topics:

1. Query on skb buffer  (Kumar amit mehta)
2. Re: Query on skb buffer (valdis.kletni...@vt.edu)
3. Several unrelated beginner questions. (Konstantin Kowalski)
4. Re: Several unrelated beginner questions. (Gaurav Jain)
5. Re: Several unrelated beginner questions.
   (valdis.kletni...@vt.edu)
6. zap_low_mappings (ishare)
7. Re: zap_low_mappings (valdis.kletni...@vt.edu)

 --

 Message: 1
 Date: Wed, 6 Mar 2013 10:39:13 -0800
 From: Kumar amit mehta gmate.a...@gmail.com
 Subject: Query on skb buffer
 To: kernelnewbies@kernelnewbies.org
 Message-ID: 20130306183913.ga3...@gmail.com
 Content-Type: text/plain; charset=us-ascii

 My current understanding is that the skb, while being passed along the various
 layers of the Linux network stack, is manipulated mainly through the
 skb->{head|data|tail|end|len} fields.

 Suppose that my application (say 'ping') sends an ICMP echo request with a
 large packet size of 4k, i.e. $ ping -s 4096 <dest addr>. Now, if
 alloc_skb(4096, GFP_KERNEL) is the routine that gets called to allocate the
 kernel buffer, how does the kernel handle possible memory allocation
 failures, and how does it handle large packet requests from the application?

 -Amit
[Pranay Kumar Srivastava] Perhaps you should have a look at linear and
non-linear data (skb_frags, to be specific). That's how large data is handled;
however, I don't think you'll be doing that with ICMP or UDP. Reading directly
from skbuffs for UDP would also give you header information, whereas with TCP
it doesn't. So unless there's a real need for it, perhaps this can be done in
userland, or you can use sock_sendmsg or sendfile (for zero copy).
--P.K.S
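
To make the head/data/tail/end bookkeeping and the linear versus non-linear
split a bit more concrete, here is a small userspace model. It only mimics the
arithmetic that skb_reserve(), skb_put() and skb_push() perform and the
relation len = linear bytes + data_len. struct toy_skb and all toy_* helpers
are hypothetical names, offsets stand in for pointers, and there are no bounds
checks (the real helpers do check).

```c
#include <stddef.h>

struct toy_skb {
    unsigned char buf[256];
    size_t head, data, tail, end; /* offsets into buf instead of pointers */
    size_t len;                   /* total bytes: linear part + paged part */
    size_t data_len;              /* bytes that live in page fragments */
};

static void toy_skb_init(struct toy_skb *skb)
{
    skb->head = skb->data = skb->tail = 0;
    skb->end = sizeof(skb->buf);
    skb->len = 0;
    skb->data_len = 0;
}

/* Leave room for headers: only valid on an empty buffer. */
static void toy_skb_reserve(struct toy_skb *skb, size_t n)
{
    skb->data += n;
    skb->tail += n;
}

/* Append n bytes of payload at the tail; returns where to write them. */
static unsigned char *toy_skb_put(struct toy_skb *skb, size_t n)
{
    unsigned char *old_tail = skb->buf + skb->tail;
    skb->tail += n;
    skb->len += n;
    return old_tail;
}

/* Prepend n bytes (a header) in front of data; returns the new data start. */
static unsigned char *toy_skb_push(struct toy_skb *skb, size_t n)
{
    skb->data -= n;
    skb->len += n;
    return skb->buf + skb->data;
}

/* Account for n bytes held in a page fragment (the non-linear part). */
static void toy_skb_add_frag(struct toy_skb *skb, size_t n)
{
    skb->len += n;
    skb->data_len += n;
}

static size_t toy_skb_headroom(const struct toy_skb *skb)
{
    return skb->data - skb->head;
}

static size_t toy_skb_tailroom(const struct toy_skb *skb)
{
    return skb->end - skb->tail;
}

/* Bytes in the linear area only. */
static size_t toy_skb_headlen(const struct toy_skb *skb)
{
    return skb->len - skb->data_len;
}
```

The point of the model is that a large payload does not have to fit between
data and tail: the bulk can sit in fragments, and only len and data_len tell
you how much is where.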

 --

 Message: 2
 Date: Wed, 06 Mar 2013 14:32:27 -0500
 From: valdis.kletni...@vt.edu
 Subject: Re: Query on skb buffer
 To: Kumar amit mehta gmate.a...@gmail.com
 Cc: kernelnewbies@kernelnewbies.org
 Message-ID: 9932.1362598...@turing-police.cc.vt.edu
 Content-Type: text/plain; charset=us-ascii

 On Wed, 06 Mar 2013 10:39:13 -0800, Kumar amit mehta said:

  Now, if alloc_skb(4096, GFP_KERNEL) is the routine that gets called to
  allocate the kernel buffer then, how does the kernel manages such
  prospective memory allocation failures and how kernel manages large
  packet requests from the application.

 Did you actually look at the source for use of alloc_skb() and how it handles
 error returns?

 (Hint - the kernel doesn't do the same thing at every use of alloc_skb(),
 because an allocation failure needs to be handled differently depending on
 where it happens.  At some places, just bailing out and dropping the packet
 on the floor without any notification to anybody is appropriate.  At other
 places, we need to propagate an error condition to the caller).

 Typical pattern (from net/core/sock.c):

 /*
  * Allocate a skb from the socket's send buffer.
  */
 struct sk_buff *sock_wmalloc(struct sock *sk, unsigned long size, int force,
                              gfp_t priority)
 {
         if (force || atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
                 struct sk_buff *skb = alloc_skb(size, priority);
                 if (skb) {
                         skb_set_owner_w(skb, sk);
                         return skb;
                 }
         }
         return NULL;
 }
 EXPORT_SYMBOL(sock_wmalloc);

 and then the caller does something like this (net/ipv4/ip_output.c, in
 function __ip_append_data()):

         } else {
                 skb = NULL;
                 if (atomic_read(&sk->sk_wmem_alloc) <=
                     2 * sk->sk_sndbuf)
                         skb = sock_wmalloc(sk,
                                            alloclen + hh_len + 15, 1,
                                            sk->sk_allocation);
                 if (unlikely(skb == NULL))
                         err = -ENOBUFS

RE: Major/minor numbers

2013-03-05 Thread Pranay Kumar Srivastava

, read source code and/or comments and figure out what
  Linux does to prevent the 'thundering herd' problem (consider 100
  threads all waiting on the same mutex - if you blindly wake all 100
  up, you'll schedule them all, the first will find the mutex available
  and then re-take it, and then the next 99 will get run only to find it
  contended and go back to sleep.  So figure out what Linux does in that
  case. :)
 
 Googling around, I found the 'thundering herd' being mentioned in relation
 to threads waiting on sockets using the accept() sys call.
 Are wait's on mutex's also plagued by the same issue? I guess it is, though
 what sys call would be used in this case?
 
 Thanks,
 -mandeep
 
 
 
 --
 
 Message: 9
 Date: Tue, 5 Mar 2013 18:05:46 +0800
 From: ishare june.tune@gmail.com
 Subject: Re: pthread_lock
 To: kernelnewbies@kernelnewbies.org
 Message-ID: 20130305100546.GA2541@debian.localdomain
 Content-Type: text/plain; charset=us-ascii
 
 On Tue, Mar 05, 2013 at 01:39:54PM +0530, Mandeep Sandhu wrote:
  On Tue, Mar 5, 2013 at 11:32 AM,  valdis.kletni...@vt.edu wrote:
   On Tue, 05 Mar 2013 11:02:45 +0530, Mandeep Sandhu said:
  
   next schedule. I think the waiting threads (processes) will moved
   from the wait queue to the run queue from where they will be
   scheduled to run.
  
   For bonus points, read source code and/or comments and figure out
   what Linux does to prevent the 'thundering herd' problem (consider
   100 threads all waiting on the same mutex - if you blindly wake all
   100 up, you'll schedule them all, the first will find the mutex
   available and then re-take it, and then the next 99 will get run
   only to find it contended and go back to sleep.  So figure out what
   Linux does in that case. :)
 
  Googling around, I found the 'thundering herd' being mentioned in
  relation to threads waiting on sockets using the accept() sys call.
  Are wait's on mutex's also plagued by the same issue? I guess it is,
  though what sys call would be used in this case?
 
  The threads waiting on sockets will be woken up by a network event.
  Similarly, the waiters on a mutex can be woken up by a signal. I guess it is
 pthread_cond_signal.
 
 
 
  Thanks,
  -mandeep
 
 
 
 --
 
 Message: 10
 Date: Tue, 5 Mar 2013 12:21:47 +
 From: Anuz Pratap Singh Tomar chambilketha...@gmail.com
 Subject: Re: Major/minor numbers
 To: Shraddha Kamat sh200...@gmail.com
 Cc: kernelnewbies kernelnewbies@kernelnewbies.org
 Message-ID:
   CAJnfX5uEbxk2kPLqtS3KsvH01C+iPrj3K3qdgunq9SbrD9GBnQ@mail.
 gmail.com
 Content-Type: text/plain; charset=iso-8859-1
 
 On Tue, Mar 5, 2013 at 6:32 AM, Shraddha Kamat sh200...@gmail.com
 wrote:
 
  Does the max number of devices supported by Linux limited by major
  minor number ? Can you please give me some pointers regarding this.
 
 http://stackoverflow.com/questions/14833467/maximum-values-of-major-
 and-minor-numbers-in-linux

[Pranay Kumar Srivastava] One more point: be careful when passing a device
number from userland to the kernel. I'm on 2.6.32 (SLES 11 SP1), and the
major(), minor() and makedev() functions provided by glibc are quite different
from the macros you'll encounter in the kernel. Apparently userland still uses
16-bit numbers on my SLES 11 SP1 system, while internally the kernel uses
32-bit device numbers (split as 12 bits major, 20 bits minor). If you are
thinking of passing raw device numbers from userland to the kernel, refrain
from it. Instead, pass the major and minor numbers separately and combine them
inside the kernel using the MKDEV macro.

--P.K.S
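
To show why a raw number is risky to hand across, here is a small userspace
sketch with two encodings side by side: the kernel-internal layout from
include/linux/kdev_t.h (MINORBITS is 20, MKDEV shifts the major above the
minor) and the legacy 16-bit layout with 8 bits each. The TOY_* macro names
and toy_dev_from_user() are invented for this example. The helper only
illustrates the "pass major and minor separately, then combine" idea and is
not taken from any driver.

```c
#include <stdint.h>

/* Kernel-internal layout: 12-bit major above a 20-bit minor. */
#define TOY_MINORBITS 20
#define TOY_MINORMASK ((1U << TOY_MINORBITS) - 1)
#define TOY_MKDEV(ma, mi) (((uint32_t)(ma) << TOY_MINORBITS) | (uint32_t)(mi))
#define TOY_MAJOR(dev)    ((uint32_t)(dev) >> TOY_MINORBITS)
#define TOY_MINOR(dev)    ((uint32_t)(dev) & TOY_MINORMASK)

/* Legacy 16-bit layout: 8-bit major above an 8-bit minor. */
#define TOY_OLD_MKDEV(ma, mi) \
    ((uint16_t)((((ma) & 0xffU) << 8) | ((mi) & 0xffU)))

/*
 * Hypothetical helper: take major and minor as two separate values,
 * range-check them, and only then build the 32-bit number.
 * Returns 0 on success, -1 if either value does not fit.
 */
static int toy_dev_from_user(unsigned int major, unsigned int minor,
                             uint32_t *dev)
{
    if (major >= (1U << 12) || minor > TOY_MINORMASK)
        return -1;
    *dev = TOY_MKDEV(major, minor);
    return 0;
}
```

For major 8, minor 1 the two layouts give 8388609 and 2049 respectively, so a
number built with one layout and decoded with the other comes out wrong, while
the separate (major, minor) pair is unambiguous.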
  -- Shraddha
 
 
 
 
 
 
 --
 Thank you
 Warm Regards
 Anuz

RE: Major/minor numbers

2013-03-05 Thread Pranay Kumar Srivastava







RE: barrier()

2013-02-25 Thread Pranay Kumar Srivastava

)); printf(tmp_long_ptr
 :0x%llx\n, (long long int) tmp_long_ptr);
if ((long_ptr == tmp_long_ptr)  (long_ptr = 0x3000))
   printf(valid 64 addr\n);
   }
 
 --
 
 Message: 4
 Date: Mon, 25 Feb 2013 12:26:06 +0530
 From: Shraddha Kamat sh200...@gmail.com
 Subject: barrier()
 To: kernelnewbies kernelnewbies@kernelnewbies.org
 Message-ID: 1361775366.22170.27.ca...@oc5268484881.ibm.com
 Content-Type: text/plain; charset=UTF-8
 
 #define barrier() asm volatile("" ::: "memory")
 
 What exactly is volatile("" ::: "memory") doing here?
 I was referring to the GNU as (ver 2.14) manual but could not get much of a
 clue about this assembly construct. Any pointers?
[Pranay Kumar Srivastava] According to the GCC extended inline assembly manual,
the last ':' begins the clobber list, which tells GCC what the asm modifies so
that it does not keep live values there. This page,
http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, also describes the special
"memory" keyword. It goes something like the following:

 If your assembler instructions access memory in an unpredictable fashion, add 
`memory' to the list of clobbered registers. This causes GCC to not keep memory 
values cached in registers across the assembler instruction and not optimize 
stores or loads to that memory. You also should add the volatile keyword if the 
memory affected is not listed in the inputs or outputs of the asm, as the 
`memory' clobber does not count as a side-effect of the asm. If you know how 
large the accessed memory is, you can add it as input or output but if this is 
not known, you should add `memory' 

There's also a nice small example on that page. I hope that helps.
Regards
P.K.S
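
For reference, a minimal userspace use of the same construct, assuming GCC or
Clang (the asm syntax is a compiler extension). The template string is empty,
so no instruction is emitted. The three colons separate the (empty) output,
input and clobber lists, and the "memory" clobber only stops the compiler from
caching or reordering memory accesses across that point. It is not a CPU
memory fence. shared_flag and publish_then_read() are names made up for the
sketch.

```c
/* Compiler-only barrier, written with the GCC extended asm syntax. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static int shared_flag;

/*
 * Store, then force the compiler to treat memory as changed, then load.
 * Without the barrier the compiler could simply return 'value' from a
 * register. With it, shared_flag has to be re-read from memory.
 */
static int publish_then_read(int value)
{
    shared_flag = value;
    barrier();
    return shared_flag;
}
```

Comparing the generated assembly with and without the barrier() line (for
example at -O2) is a quick way to see the effect.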
 
 
 
 
 
 --
 
 Message: 5
 Date: Mon, 25 Feb 2013 15:02:39 +0800
 From: bill4carson bill4car...@gmail.com
 Subject: Re: test jiffies on ARM SMP board
 To: kernelnewbies@kernelnewbies.org
 Message-ID: 512b0c8f.30...@gmail.com
 Content-Type: text/plain; charset=UTF-8; format=flowed
 
 
 
 On 2013-02-21 00:39, buyitian wrote:
  I am confused about my test. In one device driver,
  I put the code below:
 
   printk("start to test test jiffies\n");
 
   local_irq_save(flags);
 
   jf1 = jiffies; // read jiffies first time
 
   // hold cpu for about 2 seconds (do some calculation)
 
   jf2 = jiffies; // read jiffies after 2 seconds
 
   local_irq_restore(flags);
 
   printk("jf1:%lu, jf2:%lu\n", jf1, jf2);
 
  and the output is as below:
 
   <4>[  108.551124]start to test test jiffies
   <4>[  110.367604]jf1:4294948151, jf2:4294948151
 
  jf1 and jf2 have the same value, although they were
  read about 2 seconds apart. I think this is because
  I disabled local interrupts.
  But the printk timestamps go from 108.551124 to 110.367604,
  which is about 2 seconds, and on my platform the printk timestamp
  comes from the function read_sched_clock:
  static u32 __read_mostly (*read_sched_clock)(void) =
 jiffy_sched_clock_read;
 
  and the function jiffy_sched_clock_read() reads from jiffies.
 
  It seems that jiffies is frozen while local irqs are disabled,
  but after local_irq_restore() jiffies not only starts
  to run again, but also recovers the lost 2 seconds.
 
  Is jiffies updated from another cpu while irqs are disabled on
  the local cpu?
 
  Is there some inter-processor interrupt between cpu1 and cpu0
  after local irqs are re-enabled, so that jiffies recovers the lost 2 seconds?
 
 
 /*
  * Event handler for periodic ticks
  */
 void tick_handle_periodic(struct clock_event_device *dev)
 {
         int cpu = smp_processor_id();
         ktime_t next;

         tick_periodic(cpu);

         if (dev->mode != CLOCK_EVT_MODE_ONESHOT)
                 return;
         /*
          * Setup the next period for devices, which do not have
          * periodic mode:
          */
         next = ktime_add(dev->next_event, tick_period);
         for (;;) {
                 /* <--- once irq enabled, here we got -ETIME, then */
                 if (!clockevents_program_event(dev, next, ktime_get()))
                         return;
                 /*
                  * Have to be careful here. If we're in oneshot mode,
                  * before we call tick_periodic() in a loop, we need
                  * to be sure we're using a real hardware clocksource.
                  * Otherwise we could get trapped in an infinite
                  * loop, as the tick_periodic() increments jiffies,
                  * when then will increment time, possibly causing
                  * the loop to trigger again and again.
                  */
                 if (timekeeping_valid_for_hres())
                         tick_periodic(cpu);   /* <--- here, we add missing jiffies */
                 next

RE: Where does kernel store per task file position?

2013-01-29 Thread Pranay Kumar Srivastava

 -Original Message-
 From: Rajat Sharma [mailto:fs.ra...@gmail.com]
 Sent: Wednesday, January 30, 2013 11:16 AM
 To: Pranay Kumar Srivastava
 Cc: kernelnewbies@kernelnewbies.org
 Subject: Re: Where does kernel store per task file position?

  I'm still not able to figure out where exactly is the position of file
  stored per task_struct.
 struct file * itself is per process (task_struct), so file->f_pos is the file
 position per process, if that's what you are looking for. I hope you haven't
 assumed that struct file itself is unique for a file, i.e. per inode? Then
 that assumption is wrong.
 -Rajat

[Pranay Kumar Srivastava] That really was a stupid question; it says so right
there: get_empty_filp() in do_sys_open. For forks, the inherited files share a
common struct file [Correct?], but files opened after the fork in the child or
parent will not have a shared struct file [Correct?]. So the same dentry can be
pointed to by multiple struct files [Correct?], which is why the dentry's
reference count is incremented during lookup [Correct?].
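
The sharing rules can be observed from userspace without reading kernel
source. The sketch below assumes a POSIX system with a writable /tmp, and
probe_offsets() is a name made up here. It opens a temporary file, duplicates
the descriptor with dup() (which, like a descriptor inherited across fork(),
refers to the same open file description, i.e. the same struct file), opens
the same path a second time (a separate struct file), writes 5 bytes through
the first descriptor and then reads back all three offsets.

```c
#define _POSIX_C_SOURCE 200809L /* for mkstemp() under strict -std modes */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Fills in the current offsets of the original descriptor, its dup()
 * and an independent open() of the same file.
 * Returns 0 on success, -1 on any error.
 */
static int probe_offsets(off_t *dup_off, off_t *reopen_off, off_t *orig_off)
{
    char path[] = "/tmp/fpos-XXXXXX";
    int fd = mkstemp(path);             /* first struct file */
    if (fd < 0)
        return -1;

    int fd_dup = dup(fd);               /* same struct file as fd */
    int fd_new = open(path, O_RDONLY);  /* second, independent struct file */
    int rc = -1;

    if (fd_dup >= 0 && fd_new >= 0 && write(fd, "hello", 5) == 5) {
        *orig_off = lseek(fd, 0, SEEK_CUR);
        *dup_off = lseek(fd_dup, 0, SEEK_CUR);
        *reopen_off = lseek(fd_new, 0, SEEK_CUR);
        rc = 0;
    }

    if (fd_new >= 0)
        close(fd_new);
    if (fd_dup >= 0)
        close(fd_dup);
    close(fd);
    unlink(path);
    return rc;
}
```

After the write, the original and the dup()ed descriptor both report offset 5,
while the separately opened one still reports 0. That matches the picture of
one f_pos per struct file, updated by the write itself.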

Thanks a lot!

 On Tue, Jan 29, 2013 at 6:38 PM, Pranay Kumar Srivastava
 pranay.shrivast...@hcl.com wrote:
 Hi Everyone,

 I was trying to find out where Linux stores the per-process file position,
 since struct file is allocated once when the file is first opened
 (get_empty_filp() via do_sys_open). I looked at these:

 copy_process ---> copy_files --> dup_fd; it seemed to allocate only
 (struct file *) and struct files_struct, but I couldn't find any field that is
 actually being used to store the file position.

 I'm still not able to figure out where exactly the file position is stored
 per task_struct. Secondly, even if this were being saved, does the kernel
 change f_pos of struct file whenever a read/write is done? I don't think that
 happens [Correct?].

 Regards,
 Pranay Kumar Srivastava


How to Faking a PCI or USB device.

2012-11-26 Thread Pranay Kumar Srivastava

Hi Everyone,

I'm learning how to write PCI and USB device drivers; however, I don't have any
real device to work with. Most articles I read either use some (real) USB
device or just explain how it works (structures and API).

Is it possible to fake such a device, one that probably does nothing but whose
parameters I can, say, modify? In short, is it possible to devise a fake
configuration address space for the fake device and try to use it (a completely
in-memory device)? If it can be done, please give me some pointers.

Thanks for reading!





RE: How to Faking a PCI or USB device.

2012-11-26 Thread Pranay Kumar Srivastava

 -Original Message-
 From: Mulyadi Santosa [mailto:mulyadi.sant...@gmail.com]
 Sent: Monday, November 26, 2012 4:07 PM
 To: Pranay Kumar Srivastava
 Cc: kernelnewbies@kernelnewbies.org
 Subject: Re: How to Faking a PCI or USB device.

 Hi..

 On Mon, Nov 26, 2012 at 5:27 PM, Pranay Kumar Srivastava
 pranay.shrivast...@hcl.com wrote:

  Is it possible to fake such a device that probably does nothing but I can
  say modify some parameters of the fake device? In short is it possible to
  devise a fake configuration address space of the fake device and try to use
  it (completely in memory device)?.  If it can be done please give me some
  pointers.

 what kind of device?

Any kind. I was looking around for the simplest USB and PCI devices. I found
an article by Greg, http://www.linuxjournal.com/article/7353, written a long
time back, so maybe something like this. If I can find some cheap hardware like
it I would like to use it to decorate my desk for new year :P. But until I find
such simple hardware, what would I need to know to fake such a simple device?

 maybe if you're lucky, QEMU can emulate that for you...e.g: network
 adapter...

Well, let's just say it emulates a network adapter (PCI, right?); wouldn't a
default driver exist for that? So should I just remove that driver and use my
driver instead?

 --
 regards,

 Mulyadi Santosa
 Freelance Linux trainer and consultant

 blog: the-hydra.blogspot.com
 training: mulyaditraining.blogspot.com

Thanks for the help I really appreciate it.


Query regarding some blk_queue_XXX functions. (For kernel 2.6.32 SLES 11)

2012-09-03 Thread Pranay Kumar Srivastava

Hi All,

I'm using SLES11 (kernel 2.6.32)


I recently wrote a dummy block device driver where the whole disk is
made of pages in memory. I have some queries regarding the usage of some of the
functions below.

I've absolutely no idea what the first 2 do. What I was trying to do was get
the block layer to give my driver requests that are aligned to my device's
minimum sector size. Since I wanted to play with the driver I chose 1024 bytes
instead of 512; however, I've also tried a 2048-byte device sector size and
all seems well :D

Now the problem is that the kernel wants to give me everything in 512-byte
units, but I want to receive everything properly aligned to my device's sector
size. So I looked in the kernel code and thought I could use the functions
below...

1. blk_limits_io_min:
No clue about this one; in fact the device didn't get added when I used it.


2. blk_queue_physical_block_size:
This one checks the logical block size before it sets the physical block size
itself, but I'm not really sure about its purpose.

3. blk_queue_logical_block_size:
I think this is the one responsible for my getting requests properly aligned to
the device's sector size? Not really sure...


4. set_capacity:
There's also a callback in block_device_operations by this name? However, this
function just sets the size of part0 of the gendisk. I've not seen the callback
being called; I used some print statements and I never see them. So what's
the purpose of the callback?
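
On the 512-byte point: the block layer expresses sector numbers (for example
the request start position) and the set_capacity() size in 512-byte units no
matter what logical block size the queue advertises, so a driver with 1024- or
2048-byte device sectors has to convert. The arithmetic can be sketched in
plain C as below. KERNEL_SECTOR_SIZE and the toy_* helpers are names chosen
for this example only, and the claim that setting the logical block size is
what makes requests arrive aligned is my understanding rather than something
this code proves.

```c
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_SECTOR_SIZE 512u /* unit used by the block layer */

/* Is a request (start in 512-byte sectors, length in bytes) aligned
 * to the device's own block size? */
static bool toy_is_aligned(uint64_t pos_512, uint32_t nr_bytes,
                           uint32_t dev_block_size)
{
    uint64_t byte_off = pos_512 * KERNEL_SECTOR_SIZE;
    return byte_off % dev_block_size == 0 && nr_bytes % dev_block_size == 0;
}

/* Convert a 512-byte sector number into a device block index. */
static uint64_t toy_dev_block(uint64_t pos_512, uint32_t dev_block_size)
{
    return pos_512 * KERNEL_SECTOR_SIZE / dev_block_size;
}

/* Capacity to report, in 512-byte sectors, for a device that has
 * dev_blocks blocks of dev_block_size bytes each. */
static uint64_t toy_capacity_512(uint64_t dev_blocks, uint32_t dev_block_size)
{
    return dev_blocks * (dev_block_size / KERNEL_SECTOR_SIZE);
}
```

So a disk with 100 blocks of 2048 bytes would be reported as 400 sectors, and
a request starting at 512-byte sector 6 lands on device block 3 when the
device block size is 1024.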

Lastly:
There's a field gd->start in gendisk which apparently sets the start of the data
sectors? But why would the block layer be bothered with that? Is this used by
filesystems, since application programs can certainly write to sector 0 as well,
right?


And a couple more :P ,

When I try to use fdisk on the disk, it seems to create a partition table (I
think); however, when it tries to read the partition table again I get "invalid
argument" for an ioctl. So I think I'm missing an ioctl call in my driver,
correct? I have none yet, though.

However, this issue aside, if I'm able to create a partition then my driver
should be able to handle additional disks, correct? And it would follow the same
operations as when I added my module and called add_disk the first time, correct?

Thanks a lot for reading!




Regards,
Pranay Kr. Srivastava
pranay.shrivast...@hcl.com
Software Engineer
ERS,HCL Technologies
A-5, Sector 24, Noida 201301, U.P. (India)




Re: Sysfs class attribute problem

2012-06-21 Thread Pranay Kumar Srivastava

 -Original Message-
 From: kernelnewbies-boun...@kernelnewbies.org [mailto:kernelnewbies-
 boun...@kernelnewbies.org] On Behalf Of kernelnewbies-
 requ...@kernelnewbies.org
 Sent: Wednesday, June 20, 2012 11:33 PM
 To: kernelnewbies@kernelnewbies.org
 Subject: Kernelnewbies Digest, Vol 19, Issue 39


 Today's Topics:

1. RE: A confusion about invoking my syscall (Jeff Haran)
2. Re: Sysfs class attribute problem (Jeshwanth Kumar N K Jeshu)

 --

 Message: 1
 Date: Wed, 20 Jun 2012 16:58:30 +
 From: Jeff Haran jha...@bytemobile.com
 Subject: RE: A confusion about invoking my syscall
 To: ?? wangzhe5...@gmail.com
 Cc: kernelnewbies kernelnewbies@kernelnewbies.org
 Message-ID:
   4ab110e2fa959f41ac8480e84516371105b...@hq-ex01.bytemobile.com
 Content-Type: text/plain; charset=iso-2022-jp

 From: ?? [mailto:wangzhe5...@gmail.com]
 Sent: Wednesday, June 20, 2012 1:16 AM
 To: Jeff Haran
 Cc: kernelnewbies
 Subject: Re: A confusion about invoking my syscall

 2012/6/20 Jeff Haran jha...@bytemobile.com

 From: ?? [mailto:wangzhe5...@gmail.com]
 Sent: Monday, June 18, 2012 9:32 PM
 To: Jeff Haran
 Cc: kernelnewbies
 Subject: Re: A confusion about invoking my syscall

 2012/6/19 Jeff Haran jha...@bytemobile.com

 From: kernelnewbies-boun...@kernelnewbies.org [mailto:kernelnewbies-
 boun...@kernelnewbies.org] On Behalf Of ??
 Sent: Monday, June 18, 2012 6:40 PM
 To: kernelnewbies
 Subject: A confusion about invoking my syscall

 Hello everyone:

  I appended a simple syscall to the kernel, and the function is as
 follows:

   asmlinkage long sys_mysyscall(long data)
   {
       printk("This is my syscall!\n");
       return data;
   }

 and I tested it successfully in user space with this test program:

#include <linux/unistd.h>
#include <syscall.h>
#include <sys/types.h>
#include <stdio.h>

int main(void)
{
    long n = 0, m = 0, pid1, pid2;
    n = syscall(345, 190);       // #define __NR_mysyscall  345
    printf("n = %ld\n", n);
    pid1 = syscall(SYS_getpid);  // getpid
    printf("pid = %ld\n", pid1);
    pid2 = syscall(20);          // getpid
    printf("pid = %ld\n", pid2);
    return 0;
}
 and the result:
 n = 190
 pid = 4097
 pid = 4097

 But if the test program is:
 #include <linux/unistd.h>
 #include <syscall.h>
 #include <sys/types.h>
 #include <unistd.h>     /* declares syscall() */
 #include <stdio.h>

 int main(void)
 {
         long n = 0, m = 0, pid1, pid2;

         n = syscall(345, 190);          /* #define __NR_mysyscall 345 */
         printf("n = %ld\n", n);
         m = syscall(SYS_mysyscall, 190);
         printf("m = %ld\n", m);
         pid1 = syscall(SYS_getpid);     /* getpid */
         printf("pid = %ld\n", pid1);
         pid2 = syscall(20);             /* getpid */
         printf("pid = %ld\n", pid2);
         return 0;
 }
 and the result:
 wanny@wanny-C-Notebook-:~/syscall/src$ gcc test1.c
 test1.c: In function 'main':
 test1.c:13:14: error: 'SYS_mysyscall' undeclared (first use in this
 function)
 test1.c:13:14: note: each undeclared identifier is reported only once
 for each function it appears in

 Why can't I invoke my syscall with SYS_mysyscall?

 Thanks in advance!
 Because it appears you never defined the symbol SYS_mysyscall.
  I think so, but where should I define the symbol SYS_mysyscall?
   And where is the symbol SYS_getpid defined?
 On my system, in /usr/include/bits/syscall.h, which is being included in
 your program because it includes syscall.h.
  Line 83 there reads "#define SYS_getpid __NR_getpid", so SYS_getpid is
  replaced by __NR_getpid, and __NR_getpid is defined in the
  kernel (arch/x86/include/asm/unistd_32.h). My syscall is also
  defined there. I added "#define SYS_mysyscall __NR_mysyscall", but I
  don't know why it doesn't work.

 My sources contain no reference to SYS_mysyscall nor __NR_mysyscall, so
 I assume you've added them to the Linux include files that you built
 your module from.

 User space programs like your main() program above generally aren't
 going to include Linux source tree include files. When you include
 syscall.h from a user space program in a typical development
 environment, the compiler is by default going to look for syscall.h in
 /usr/include, not in the Linux source tree where presumably you've made
 your modifications. Of course you can always tell the compiler to look
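
A minimal sketch of one way around this, assuming the number 345 from the test program above: since the libc headers under /usr/include know nothing about a locally added syscall, the user-space program can supply its own fallback definition of SYS_mysyscall. The wrapper names mysyscall() and getpid_via_syscall() are made up for illustration.

```c
#define _GNU_SOURCE             /* for the syscall() declaration */
#include <unistd.h>
#include <sys/syscall.h>

/*
 * The installed libc headers do not define a symbol for a locally added
 * syscall, so provide one here. 345 is the number used in the test
 * program; it is only meaningful on the poster's patched kernel.
 */
#ifndef SYS_mysyscall
#define SYS_mysyscall 345
#endif

/* Hypothetical wrapper: invokes the added syscall by number. */
long mysyscall(long data)
{
	return syscall(SYS_mysyscall, data);
}

/* SYS_getpid needs no such help because the libc headers define it. */
long getpid_via_syscall(void)
{
	return syscall(SYS_getpid);
}
```

On a kernel without the patch, mysyscall() should simply fail (typically with ENOSYS). The alternative hinted at above is to point the compiler at the modified headers with -I, which keeps the number in one place.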

kernel_sendpage query.

2012-06-09 Thread Pranay Kumar Srivastava

Hi,

I've been trying to understand kernel_sendpage but I haven't been able to
figure it out completely. Hopefully someone here knows it better and can
help me out.

I'm using kernel_sendpage on a TCP connection, and it works well when only a
few kernel threads are sending data with it.
The page I hand over to kernel_sendpage is then reused: I read data from the
socket into it, process the data, and send the processed data back out from
the same page. The data is at most 2KB and never less than 120 bytes.

As far as I can see in the code, the page isn't copied into the skb frags
array; it is just assigned, and get_page is called to increment the page's
reference count. (I don't free the page until the thread is stopped, and it
never is unless it gets a signal.)

What I don't know is whether kernel_sendpage waits for the page to be sent.
I've tried both MSG_DONTWAIT and passing 0 for flags, but every now and then
the following problem shows up at the client. This is the best explanation I
could think of:

When too many kernel threads are sending data with kernel_sendpage, without
the MSG_DONTWAIT flag, the call still seems to succeed. However, since I'm
reusing the page, its contents can get overwritten by the next sock_recvmsg,
so by the time the network stack is ready to send my page, the client
receives garbage data?

I observed the same issue with MSG_DONTWAIT set; in that case too the client
sometimes gets garbage data.

So my query is:

What do I need to do when using kernel_sendpage to be sure that the network
stack has actually sent my page, so that I can reuse it for sock_recvmsg?

Thanks a lot for reading!




Regards,
Pranay Kr. Srivastava
pranay.shrivast...@hcl.com
Software Engineer
ERS,HCL Technologies
A-5, Sector 24, Noida 201301, U.P. (India)




RE: Crash when sending a lot of messages through a unix socket

2012-05-02 Thread Pranay Kumar Srivastava

You mentioned your code is atomic. Are you certain that you can call
sock_sendmsg from a non-process context? I believe you can't, because for
DGRAM sockets sock_sendmsg ends up in udp_sendmsg, and I think that assumes
you are in process context.


From: kernelnewbies-boun...@kernelnewbies.org 
[kernelnewbies-boun...@kernelnewbies.org] On Behalf Of 
kernelnewbies-requ...@kernelnewbies.org 
[kernelnewbies-requ...@kernelnewbies.org]
Sent: Wednesday, May 02, 2012 9:30 PM
To: kernelnewbies@kernelnewbies.org
Subject: Kernelnewbies Digest, Vol 18, Issue 2


Today's Topics:

   1. please report distros with CONFIG_DYNAMIC_DEBUG, using
  ddebug_query= boot param (Jim Cromie)
   2. Re: immutable wiki? (mic...@michaelblizek.twilightparadox.com)
   3. Re: immutable wiki? (Bill Traynor)
   4. Crash when sending a lot of messages through a unix socket
  (Panagiotis Sakkos)


--

Message: 1
Date: Tue, 1 May 2012 12:36:04 -0600
From: Jim Cromie jim.cro...@gmail.com
Subject: please report distros with CONFIG_DYNAMIC_DEBUG, using
ddebug_query=   boot param
To: kernelnewbies kernelnewbies@kernelnewbies.org
Message-ID:
cajfubxw8ayimau7ko42vh_e-zzbnskw9hd8dx2tfcowl4ue...@mail.gmail.com
Content-Type: text/plain; charset=ISO-8859-1

hi all,

I've been asked whether the ddebug_query= boot param is used in any distros.
I think the question seeks to determine a good deprecation schedule for it
(it's been obsoleted by dyndbg= in driver-core-next).

Would you all be so kind as to check your favorite distros, and report
the ones that have one or both?

Ubuntu 12.04 LTS lacks it:

jimc@chumly:~/projects/lx/linux-2.6$ grep DYNAMIC_DEBUG /boot/config*
/boot/config-3.0.0-17-generic:# CONFIG_DYNAMIC_DEBUG is not set
/boot/config-3.2.0-24-generic:# CONFIG_DYNAMIC_DEBUG is not set

Voyage linux also lacks it, (also debian based)


Fedora-16 has config option:

$ uname -a
Linux groucho.jimc.earth 3.3.2-6.fc16.x86_64 #1 SMP Sat Apr 21
12:43:20 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
$ grep DYNAMIC_DEBUG /boot/config-3.3.*
/boot/config-3.3.1-5.fc16.x86_64:CONFIG_DYNAMIC_DEBUG=y
/boot/config-3.3.2-1.fc16.x86_64:CONFIG_DYNAMIC_DEBUG=y
/boot/config-3.3.2-6.fc16.x86_64:CONFIG_DYNAMIC_DEBUG=y

but it's not used by default. I have modified my grub defaults:

$ grep ddebug_query /etc/default/grub
GRUB_CMDLINE_LINUX='quiet rhgb loglevel=8 ddebug_query="module params
+p" dynamic_debug.verbose=1 nouveau.dyndbg nouveau.force_post=1
it87.dyndbg=+p nouveau.perflvl_wr='

FYI, the above usage will eventually be unsupported;
it can be replaced in driver-core-next by any of:

  dyndbg="module params +p"
  params.dyndbg=+p
  params.dyndbg  # defaults to +p

The 2nd and 3rd forms also work for loadable modules (though params isn't
one of them).


thanks in advance
Jim



--

Message: 2
Date: Wed, 2 May 2012 12:52:38 +0200
From: mic...@michaelblizek.twilightparadox.com
Subject: Re: immutable wiki?
To: Bill Traynor w...@alphatroop.com
Cc: kernelnewbies@kernelnewbies.org
Message-ID: 20120502105237.GA2229@grml
Content-Type: text/plain; charset=us-ascii

Hi!

On 09:29 Tue 01 May , Bill Traynor wrote:
 I have an Editor account for the kernelnewbies.org wiki, but all pages
 are currently immutable.  Was the wiki made read-only at some point?

You must be on http://kernelnewbies.org/EditorsGroup to make changes. This
was done for spam protection. Anybody who is on the list can add you. What
is your username in the wiki?

-Michi
--
programing a layer 3+4 network protocol for mesh networks
see http://michaelblizek.twilightparadox.com



--

Message: 3
Date: Wed, 2 May 2012 08:06:36 -0400
From: Bill Traynor w...@alphatroop.com
Subject: Re: immutable wiki?
To: mic...@michaelblizek.twilightparadox.com
Cc: kernelnewbies@kernelnewbies.org
Message-ID:
CAGfZjq69kO91rbiG8R5jTUmRDq3ZJRjp_oTLqS8wji1M=1n...@mail.gmail.com
Content-Type: text/plain; charset=ISO-8859-1

On Wed, May 2, 2012 at 6:52 AM,
mic...@michaelblizek.twilightparadox.com wrote:
 Hi!

 On 09:29 Tue 01 May , Bill Traynor wrote:
 I have an Editor account for the kernelnewbies.org wiki, but all pages
 are currently immutable.  Was the wiki made read-only at some point?

 You must be on http://kernelnewbies.org/EditorsGroup to make changes. This
 was made for spam

RE: identity mapped paging (Vaibhav Jain)

2012-04-18 Thread Pranay Kumar Srivastava

 -Original Message-
 From: Vaibhav Jain [mailto:vjoss...@gmail.com]
 Sent: Wednesday, April 18, 2012 3:49 AM
 To: Pranay Kumar Srivastava
 Cc: kernelnewbies@kernelnewbies.org
 Subject: Re: identity mapped paging (Vaibhav Jain)


RE: identity mapped paging (Vaibhav Jain)

2012-04-17 Thread Pranay Kumar Srivastava

 -Original Message-
 From: Vaibhav Jain [mailto:vjoss...@gmail.com]
 Sent: Tuesday, April 17, 2012 4:07 PM
 To: Pranay Kumar Srivastava
 Cc: kernelnewbies@kernelnewbies.org
 Subject: Re: identity mapped paging (Vaibhav Jain)

 On Fri, Apr 13, 2012 at 2:15 AM, Vaibhav Jain vjoss...@gmail.com
 wrote:

  I am not clear about the use of identity mapped paging while paging is
  being enabled by the operating system. Also I don't understand at what
  point the identity mappings are no longer useful. According to this article
  http://geezer.osdevbrasil.net/osd/mem/index.htm#identity - "The page table
  entries used to identity-map kernel memory can be deleted once paging and
  virtual addresses are enabled." Can somebody please explain?

 Identity mapping is when VA (virtual address) = PA (physical address).

 So when you set up your page tables, you need to make sure that every
 virtual address maps to the same physical address. This is easy to do if
 you treat each 4KB block as a page, beginning at location 0 and going up
 to the highest memory address reported by the BIOS or GRUB.
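
 A minimal sketch of the identity mapping described above, assuming 32-bit
 x86 without PAE and covering only the first 4MB (one page table). The
 function and macro names are made up for illustration, and translate() is
 only a software stand-in for what the MMU does. A real kernel would also
 need the table to be page aligned and referenced from a page directory.

```c
#include <stdint.h>

#define PAGE_SIZE     4096u
#define ENTRIES       1024u   /* entries per x86 page table */
#define PTE_PRESENT   0x1u
#define PTE_WRITABLE  0x2u

/*
 * Fill one page table so that virtual page i maps to physical frame i
 * (VA == PA) for the first 4MB. The frame address occupies the upper
 * 20 bits of each entry, the flags the lower 12.
 */
void identity_map_first_4mb(uint32_t table[ENTRIES])
{
	for (uint32_t i = 0; i < ENTRIES; i++)
		table[i] = (i * PAGE_SIZE) | PTE_PRESENT | PTE_WRITABLE;
}

/* Software walk of that single table for addresses below 4MB. */
uint32_t translate(const uint32_t table[ENTRIES], uint32_t va)
{
	uint32_t entry = table[(va >> 12) & 0x3ffu];

	return (entry & ~0xfffu) | (va & 0xfffu);
}
```

 With such a table, translate(table, addr) returns addr unchanged for any
 address below 4MB, which is exactly the VA = PA property.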

 Remember that while you are setting up your PTEs and PDEs, every address
 is a physical one. So linking your kernel at a higher VA on the assumption
 that it will be remapped to lower physical memory is WRONG until the PTEs
 and PDEs are installed. Don't do that without them!

 Why would you want identity mapping? For a simple reason: when your kernel
 starts to boot there is no translation yet (no PTEs/PDEs, and AFAIK the
 paging-enable bit of the processor is still cleared just after the BIOS is
 done), but you will be enabling paging in a moment.

 So let's say you linked your kernel at a higher VA such as 3GB. The
 addresses it generates will begin at 3GB, but you still don't have page
 tables installed since your kernel has just started, so each address is
 treated as a physical address. If you have not loaded your kernel at 3GB
 physical, it will definitely come crashing down.

 To avoid the crash when the kernel is linked to the higher half of memory,
 you can use the GDT trick. Segmentation is always on, so you can choose a
 segment base that makes the address addition overflow and wrap around to a
 low physical address even though paging is not enabled yet. That makes it
 possible to load the kernel at a low memory address while it is linked for
 a higher VMA. Once your PTEs/PDEs are in place and paging is enabled, you
 can use those instead of the GDT trick.

 Here's a link to that http://wiki.osdev.org/Higher_Half_With_GDT

  Thanks
  Vaibhav Jain

 Hi,

 Thanks for replying but I am still confused. I continued reading about
 this and what I have understood is the following:
 After the kernel executes the instruction that enables paging, the
 instruction pointer will contain the address of the next instruction,
 which will now be treated as a virtual address. So for the next
 instruction to be executed, the page table should map this address to
 itself.
 Please correct me if I am wrong.
 I am confused by the point about linking the kernel to a higher address.
 Could you please put that in a step by step manner to make it clear what
 happens before paging is enabled and what happens after that?
 Also, please explain at what point during the execution of kernel code
 the identity-mapped addresses are no longer useful.

 Thanks
 Vaibhav
 Hi,

 I am somewhat understanding your point. But I have some other queries
 now in my mind.

 If the kernel is linked to 3Gigs is there a way other than the GDT
 trick.?

Make your load address = VA when you link so you won't have to worry about 
doing the GDT trick.

 In fact I am wondering that if the kernel is linked to 3Gigs and Grub
 loads it at 1MB physical, how will even the first instruction of kernel
 execute ?  I mean if all the address generated by kernel are above 3
 Gigs and paging is not enabled how will it start
 running ?

That's what the GDT trick is for. As the Intel/AMD processor manuals
describe, segmentation is always on. So whenever an address is generated,
your segment's base address is added to it before it goes out on the bus.
By putting a constant offset in the base address field of your GDT entries,
you can make the generated addresses resolve to the range starting at your
kernel's load address.
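
The arithmetic behind the trick can be sketched in plain C. This is only an illustration under assumed numbers: a kernel linked at 0xC0100000 (3GB + 1MB) and loaded at 0x00100000 (1MB), with paging off so the linear address is also the physical one. The function names are made up.

```c
#include <stdint.h>

/*
 * With paging off, the CPU forms the address as segment base + offset,
 * truncated to 32 bits. Unsigned arithmetic wraps modulo 2^32, which is
 * the overflow the trick relies on.
 */
uint32_t linear_address(uint32_t segment_base, uint32_t offset)
{
	return segment_base + offset;
}

/* Segment base that makes code linked at link_addr land on load_addr. */
uint32_t gdt_trick_base(uint32_t link_addr, uint32_t load_addr)
{
	return load_addr - link_addr;
}
```

For the assumed addresses, gdt_trick_base() gives a base of 0x40000000, and adding that base to any address in the linked range wraps around to the matching address in the loaded range.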

I would suggest you leave the higher half kernel for later and first write
some code that divides your available memory into pages and stores this
information, so you know which pages exist. Next would be identity
mapping; since your kernel has VMA=LMA in your linker script, this will be
easier to do.

Once you have paging enabled you can move on to the higher half kernel. I
would suggest working on page replacement algorithms and virtual memory
management code side by side, for better integration with paging in later
stages.

Maybe you can post your code if you are allowed

RE: Query on linker scripts

2012-03-26 Thread Pranay Kumar Srivastava

 -Original Message-
 From: Vaibhav Jain [mailto:vjoss...@gmail.com]
 Sent: Sunday, March 25, 2012 3:19 AM
 To: Pranay Kumar Srivastava
 Subject: Re: Query on linker scripts

 Hi Pranay,
 Thanks for replying! I am still not clear about this as I have not
 reached the part of the tutorial which talks about pte and pgd. Could you
 please explain this point about safety of section with a simpler
 example?

I'll take example from your script.
.bss :
{
  sbss = .;
  *(COMMON)
  *(.bss)
  ebss = .;
}

What I wanted to say was that instead of defining ebss inside the .bss
section you should define it outside that section. Inside a section
definition, . is an offset from the start of that section, and a symbol
assigned there is section-relative; outside, . is already an absolute
address. Wrapping it in ABSOLUTE makes the intent explicit, since the
addresses you are interested in are absolute ones, not offsets within a
section. So you can try something like this

.bss ALIGN(4096):
{
  sbss = .;
  *(COMMON)
  *(.bss)
}
. = ALIGN(4096);    /* pad to the next page boundary */
ebss = ABSOLUTE(.); /* page-aligned end of .bss */

 Also, from your reply I figured out that it is not compulsory
 to define such symbols and the names can be different than sbss and
 ebss. Am I right ?

The names can be anything; it's entirely your choice, but being descriptive
helps. You should have a close look at the Red Hat tutorial for linker
scripts instead of following someone else's linker script, since you might
not require all the variables chosen, or you might need some additional
variables due to the design you've chosen for your kernel.

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/simple-example.html

 Thanks
 Vaibhav Jain

Please include kernelnewbies@kernelnewbies.org in cc when replying. You are 
likely to get more responses that way.

 On Sat, Mar 24, 2012 at 11:45 AM, Pranay Kumar Srivastava
 pranay.shrivast...@hcl.com wrote:
 On 03/24/2012 11:52 PM, Pranay Kumar Srivastava wrote:

  From: kernelnewbies-
 bounces+pranay.shrivastava=hcl@kernelnewbies.org [kernelnewbies-
 bounces+pranay.shrivastava=hcl@kernelnewbies.org] On Behalf Of
 kernelnewbies-requ...@kernelnewbies.org [kernelnewbies-
 requ...@kernelnewbies.org]
  Sent: Saturday, March 24, 2012 9:30 PM
  To: kernelnewbies@kernelnewbies.org
  Subject: Kernelnewbies Digest, Vol 16, Issue 29

  Today's Topics:

      1. Query on linker scripts (Vaibhav Jain)
      2. Re: Query on linker scripts (Carlo Caione)

  -
 -

  Message: 1
  Date: Fri, 23 Mar 2012 21:43:40 -0700
  From: Vaibhav Jainvjoss...@gmail.com
  Subject: Query on linker scripts
  To: kernelnewbies@kernelnewbies.org
  Message-ID:

  CAKuUYSw=_zzykpwetbjsgeyppsrowk+whm0o5l_pncmanvc...@mail.gmail.com
  Content-Type: text/plain; charset=iso-8859-1

  Hi,

  Recently I have started reading tutorials for writing a small kernel. All
  such tutorials mention the use of linker scripts. I have read a few
  articles on linker scripts but I am stuck on one thing: I am unable to
  understand the use of defining new symbols in linker scripts. Using a
  linker script to arrange different sections in the object file is
  understandable, but defining symbols which are not referenced anywhere in
  the script is confusing. An example is the use of the symbols sbss and
  ebss in the bss section, as shown in the script below

  ENTRY (loader)
  SECTIONS
  {
       . = 0x0010;
       .text ALIGN (0x1000) :
       {
           *(.text)
       }
       .rodata ALIGN (0x1000) :
       {
           *(.rodata*)
       }
       .data ALIGN (0x1000) :
       {
           *(.data)
       }
       .bss :
       {
           sbss = .;
           *(COMMON)
           *(.bss)
           ebss = .;
       }
  }

  Please explain how defining such symbols is useful.

  Thanks
  Vaibhav Jain

  --

  Message: 2
  Date: Sat, 24 Mar 2012 16:26:38 +0100
  From: Carlo Caionecarlo.cai...@gmail.com
  Subject: Re: Query on linker scripts
  To: Vaibhav Jainvjoss...@gmail.com
  Cc: kernelnewbies@kernelnewbies.org
  Message-ID:4f6de7ae.9070...@gmail.com
  Content-Type: text/plain; charset=ISO-8859-1

RE: Query on linker scripts

2012-03-26 Thread Pranay Kumar Srivastava

 -Original Message-
 From: Vaibhav Jain [mailto:vjoss...@gmail.com]
 Sent: Monday, March 26, 2012 4:31 PM
 To: Pranay Kumar Srivastava
 Cc: kernelnewbies@kernelnewbies.org
 Subject: Re: Query on linker scripts

 Thanks a lot for the explanation and the link!! I have just one more
 question about
 linker scripts. I am not clear about alignment of sections. Is it
 necessary to align sections?

It's not mandatory, but it helps you know how many pages are in use.
Without alignment, when creating PTEs for the kernel you would have to work
out the number of pages from the number of bytes the kernel occupies. With
page-aligned sections, finding the page that corresponds to a section's
starting address is just a simple shift operation.
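
A small sketch of that shift, assuming 4KB pages (a shift of 12 bits). The function names are made up for illustration; the addresses would come from linker-script symbols in a real kernel.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u
#define PAGE_SIZE  (1u << PAGE_SHIFT)   /* 4096 bytes */

/* Page frame number containing addr: a single shift. */
uint32_t page_number(uint32_t addr)
{
	return addr >> PAGE_SHIFT;
}

/* Pages covering [start, end) when start is page aligned; end need not be. */
uint32_t pages_used(uint32_t start, uint32_t end)
{
	return (end - start + PAGE_SIZE - 1) >> PAGE_SHIFT;
}
```

For a section starting at 0x00100000, page_number() gives frame 0x100. If the section start were not page aligned, the rounding in pages_used() would no longer be enough on its own, which is the extra bookkeeping alignment avoids.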

 Can the alignment be different from 4096?
 In the script that I provided the text and data sections are aligned
 while the bss section is not. Is there
 a reason for it ?

The alignment in linker scripts generally corresponds to the page
granularity. It's not like aligning data in C, which is done on, say, 4-byte
or 8-byte boundaries. This alignment is done so that when the time comes to
protect the kernel and start user space applications, you'll know which
pages belong to the kernel, can't be swapped out and must be protected.
Instead of thinking about how many bytes the kernel uses, think about how
many pages it uses.

As for bss: it takes up no space in your executable file. It only reserves
memory for uninitialized data (and, in many tutorial kernels, the initial
stack), and it is up to you to set that memory up and use it, which is
probably why the script leaves it unaligned. Aligning it as well does no
harm and keeps the page accounting simple. The data/text sections are part
of the executable image, so you want them aligned on page boundaries so that
when the kernel is loaded those sections start at page-aligned addresses.

 Thanks
 Vaibhav Jain


Can't get Major and Minor number for device correctly.

2012-02-29 Thread Pranay Kumar Srivastava

Hi,

I have a kernel module which needs to get hold of the gendisk structure. To
make it a bit interactive, I have a user space program which uses the stat
system call to get the respective device's dev_t number (I use st_rdev of
struct stat) and passes it on to my module via an ioctl call. This part
works fine.

When my module tries to use the get_gendisk function, it returns NULL. I
printed out MAJOR(dev_t) and MINOR(dev_t) and was surprised to get the value
0 for MAJOR(dev_t). A little more digging got me to this code snippet, from
the file kdev_t.h

---%

#ifdef __KERNEL__
#define MINORBITS   20
#define MINORMASK   ((1U << MINORBITS) - 1)

#define MAJOR(dev)  ((unsigned int) ((dev) >> MINORBITS))
#define MINOR(dev)  ((unsigned int) ((dev) & MINORMASK))
#define MKDEV(ma,mi)    (((ma) << MINORBITS) | (mi))
.

#else /* __KERNEL__ */

/*
Some programs want their definitions of MAJOR and MINOR and MKDEV
from the kernel sources. These must be the externally visible ones.
*/
#define MAJOR(dev)  ((dev)>>8)
#define MINOR(dev)  ((dev) & 0xff)
#define MKDEV(ma,mi)    ((ma)<<8 | (mi))
#endif /* __KERNEL__ */

---%

Since __KERNEL__ is defined in the topmost Makefile, the first definition of 
macros would be used which is giving me the wrong result. But what I really 
want is the #else one.

Surprisingly doing a major and minor in user space works. I traced it back to 
the following code snippet, in file /usr/include/sys/sysmacros.h

---%

# if defined __GNUC__ && __GNUC__ >= 2 && defined __USE_EXTERN_INLINES
__extension__ __extern_inline unsigned int
__NTH (gnu_dev_major (unsigned long long int __dev))
{
  return ((__dev >> 8) & 0xfff) | ((unsigned int) (__dev >> 32) & ~0xfff);
}

__extension__ __extern_inline unsigned int
__NTH (gnu_dev_minor (unsigned long long int __dev))
{
  return (__dev & 0xff) | ((unsigned int) (__dev >> 12) & ~0xff);
}

#endif

# define major(dev) gnu_dev_major (dev)
# define minor(dev) gnu_dev_minor (dev)

---%

So basically the user space code doing stat and then major/minor works because 
its doing the dev_t8.

My questions are:

 1. How are MAJOR and MINOR able to work within the kernel with the 32-bit 
definitions? dev_t >> 20 would make the major number 0, so get_gendisk 
should then fail, and most of the devices listed, like my disks, don't report a 
very large dev_t when I do stat on them.

 2. Should I be checking whether or not a 32-bit dev_t is in effect? But if 
I have to do this, what is the purpose of the macros MAJOR and MINOR? Aren't 
these there to do the same job?

As an example, stat reported the st_rdev field as 2065, which 
translates to 8,17.
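
The arithmetic behind that example can be checked in user space. The sketch below is only an illustration: the k_* and u_* helpers are hypothetical names that copy the two macro sets quoted above, and decode_user_dev repeats the arithmetic of the kernel's new_decode_dev() helper from kdev_t.h, on the assumption that the value handed over the ioctl is the raw st_rdev seen by user space.

```c
#include <stdint.h>

/* Hypothetical user-space copies of the in-kernel macros quoted above. */
#define K_MINORBITS 20
#define K_MINORMASK ((1U << K_MINORBITS) - 1)

/* In-kernel dev_t layout: 12-bit major above a 20-bit minor. */
static inline unsigned int k_major(uint32_t dev) { return dev >> K_MINORBITS; }
static inline unsigned int k_minor(uint32_t dev) { return dev & K_MINORMASK; }
static inline uint32_t k_mkdev(unsigned int ma, unsigned int mi)
{
    return ((uint32_t)ma << K_MINORBITS) | mi;
}

/* Old externally visible layout: major in bits 8 and up, 8-bit minor. */
static inline unsigned int u_major(uint32_t dev) { return dev >> 8; }
static inline unsigned int u_minor(uint32_t dev) { return dev & 0xff; }

/*
 * Convert a device number in the user-space encoding into the in-kernel
 * layout. Same arithmetic as the kernel's new_decode_dev(); the function
 * name here is made up for this sketch.
 */
static inline uint32_t decode_user_dev(uint32_t dev)
{
    unsigned int major = (dev & 0xfff00) >> 8;
    unsigned int minor = (dev & 0xff) | ((dev >> 12) & 0xfff00);
    return k_mkdev(major, minor);
}
```

With st_rdev = 2065 (0x811), the 8-bit helpers give 8 and 17, while the 20-bit helpers applied to the same raw value give major 0 and minor 2065. After decode_user_dev(2065) the 20-bit helpers return 8 and 17 again, which suggests the raw value needs converting before MAJOR and MINOR are applied in the module.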


I'm using SLES 11 SP1, kernel 2.6.32



Regards,
Pranay Kr. Srivastava
pranay.shrivast...@hcl.com
Software Engineer
ERS,HCL Technologies
A-5, Sector 24, Noida 201301, U.P. (India)



___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

Using Sysfs uevents.

2012-02-15 Thread Pranay Kumar Srivastava

Hi,

I have been experimenting with sysfs and can create ksets and kobjects within 
them. What I need to know is how to use the uevents of the kobjects I 
create. While reading the code I found that certain events, such as 
ADD, DEL and a couple more, are apparently fired. Currently I'm not 
handling these events (the ops field is NULL) and they don't bother me, so I 
assume they are not mandatory?

If I were to actually do something with these events, what should it be? 
My module runs fine without them, and the uevents are supposed to be for 
userland applications (hotplug), but then how does a userspace application 
get to know about them? Does the application need to create netlink sockets for 
that? If it does, then why bother with the uevents of the kobject?
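
For reference, the userspace side of the question can be sketched as follows. This is a minimal, hedged illustration rather than a definitive answer: it assumes the kernel broadcasts uevents on a NETLINK_KOBJECT_UEVENT socket (the channel udev listens on) and that each message is a run of NUL-separated strings, an "action@devpath" header followed by KEY=VALUE fields. The function names uevent_open and uevent_get are made up for this sketch.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Open a netlink socket subscribed to kernel uevent broadcasts.
 * Returns the descriptor, or -1 on failure.
 */
static int uevent_open(void)
{
    struct sockaddr_nl addr;
    int fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = 1;          /* multicast group used for kernel uevents */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Look up KEY in a uevent payload of NUL-separated fields.
 * Returns a pointer to the value inside buf, or NULL if the key is absent.
 * A trailing field without a terminating NUL is ignored.
 */
static const char *uevent_get(const char *buf, size_t len, const char *key)
{
    size_t klen = strlen(key);
    size_t pos = 0;

    while (pos < len) {
        const char *field = buf + pos;
        const char *end = memchr(field, '\0', len - pos);
        size_t flen;

        if (end == NULL)
            break;
        flen = (size_t)(end - field);
        if (flen > klen && strncmp(field, key, klen) == 0 && field[klen] == '=')
            return field + klen + 1;
        pos += flen + 1;
    }
    return NULL;
}
```

A listener would call uevent_open() once, then loop on recv() into a buffer and pass the buffer and the received length to uevent_get() with keys such as "ACTION" or "DEVPATH".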





Working with kernel_accept?

2012-02-01 Thread Pranay Kumar Srivastava

I was trying to create a socket within the kernel and used the kernel_* helper 
functions to get started. This worked fine for UDP, but with TCP I ran into 
some issues with the following steps:

1.) I was able to create a listening TCP socket using sock_create_kern. Should 
I be using only sock_create?

2.) I changed the sk_data_ready callback of the listening socket so that a 
waiting thread is notified when a connection is ready to be accepted. 
When that happened, the thread was woken up and then called 
kernel_accept.

3.) This is where the issue started: kernel_accept uses sock_create_lite, and 
the machine just froze. After many hours I figured out that the 
problem was apparently with sock_create_lite. This function was not 
initializing sock->sk (I printed it and found it to be NULL), which I guess 
caused the machine to freeze.

4.) As a workaround, I went back to sock_create_kern and called 
sock->ops->accept instead of kernel_accept, and it worked.

Is there any other step required in order to work with kernel_accept?
I'm using SLES 11 SP1, kernel 2.6.32.12



When to use blk_end_request_* routines.

RE:Query on skb buffer (Kumar amit mehta)

RE: Major/minor numbers

RE: Major/minor numbers

RE: barrier()

RE: Where does kernel store per task file position?

How to Faking a PCI or USB device.

RE: How to Faking a PCI or USB device.

Query regarding some blk_queue_XXX functions. (For kernel 2.6.32 SLES 11)

Re: Sysfs class attribute problem

kernel_sendpage query.

RE: Crash when sending a lot of messages through a unix socket

RE: identity mapped paging (Vaibhav Jain)

RE: identity mapped paging (Vaibhav Jain)

RE: Query on linker scripts

RE: Query on linker scripts

Can't get Major and Minor number for device correctly.

Using Sysfs uevents.

Working with kernel_accept?

19 matches
