Armed with lists of e-mail addresses, spammers send billions of e-mail messages every day, messages that most users don't want. It is often difficult or impossible to tell how a spammer acquired a user's e-mail address. Is it the result of some activity the user engaged in? Did the user give his/her e-mail address to the wrong person? Was the user targeted at random? Are there steps the user could take to avoid such spam in the future? Because these questions are hard to answer, consumers who use e-mail are exposed to a variety of spam, including objectionable messages, no matter the source of the address.

This study attempts to answer some of these questions by analyzing common activities of Internet users and looking for evidence that particular activities resulted in one e-mail address receiving more spam than others. The investigation indicates that e-mail address harvesting is usually automated: spam can hit addresses soon after they are first used publicly, the spam was not targeted, and some addresses were picked up off web pages even when they weren't visible to the eye. Still, I would say that consumers can protect their e-mail addresses from harvesting programs.

Problem

There are many ways in which spammers can get email addresses. The ones commonly used are:

From posts to UseNet with your email address. Spammers regularly scan UseNet for email addresses, using ready-made programs designed to do so. Some programs look only at article headers that contain email addresses (From:, Reply-To:, etc.), while other programs check the articles' bodies, ranging from programs that look at signatures to programs that take everything containing an '@' character. People who were spammed frequently report that the volume of spam reaching their mailbox dropped sharply after a period in which they did not post to UseNet. Together with the evidence of spammers chasing 'fresh' and 'live' addresses, this suggests that UseNet posts are the primary source of email addresses for spammers.
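To make the two scanning styles concrete, here is a minimal Python sketch of what such a program could see in a single post, written as a self-check a poster might run before sending. The function name exposed_addresses, the regular expression, and the sample article are assumptions made for illustration; they are not taken from any real harvesting tool.

```python
import re
from email import message_from_string
from email.utils import getaddresses

# Deliberately loose pattern (an assumption): anything shaped like local@domain.tld.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def exposed_addresses(article_text):
    """Return (header_addresses, body_addresses) found in one article."""
    msg = message_from_string(article_text)

    # Header-only scan: just the From: and Reply-To: fields.
    header_values = []
    for name in ("From", "Reply-To"):
        header_values.extend(msg.get_all(name, []))
    header_hits = {addr for _, addr in getaddresses(header_values) if "@" in addr}

    # Body scan: everything that contains an '@' and looks like an address,
    # which also catches signatures.
    body_hits = set(ADDRESS_RE.findall(msg.get_payload()))
    return header_hits, body_hits

# Hypothetical article used only to exercise the function.
SAMPLE = """From: Alice <alice@example.com>
Newsgroups: misc.test
Subject: hello

Some text.
-- 
Alice, reach me at alice.home@example.org
"""

headers, body = exposed_addresses(SAMPLE)
print("headers:", sorted(headers))
print("body:   ", sorted(body))
```

Running this on a draft post shows which addresses would be visible to a header-only scan and which only to a full-body scan.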

1. From mailing lists. Spammers regularly attempt to get the lists of subscribers to mailing lists, knowing that the email addresses are unmunged and that only a few of the addresses are invalid. When mail servers are configured to refuse such requests, another trick might be used: spammers might send an email to the mailing list with the headers Return-Receipt-To: or X-Confirm-Reading-To:. Those headers would cause some mail transfer agents and reading programs to send email back to the sender saying that the email was delivered to / read at a given email address, divulging it to spammers. A different technique used by spammers is to ask a mailing list server for the list of all mailing lists it carries (an option implemented by some mailing list servers for the convenience of legitimate users), and then send the spam to each mailing list's address, leaving the server to do the hard work of forwarding a copy to each subscribed email address.
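One possible defence against the receipt-header trick is to drop those headers before a message reaches subscribers. The following Python sketch assumes a list operator can run a filter over each raw message; the function name strip_receipt_requests and the sample message are hypothetical, and only the two header names come from the text above.

```python
from email import message_from_string

# Header names taken from the text above; the email package matches them
# case-insensitively.
RECEIPT_HEADERS = ("Return-Receipt-To", "X-Confirm-Reading-To")

def strip_receipt_requests(raw_message):
    """Return (cleaned_message_text, removed) for one raw message.

    removed is a list of (header_name, value) pairs that were dropped.
    """
    msg = message_from_string(raw_message)
    removed = []
    for name in RECEIPT_HEADERS:
        for value in msg.get_all(name, []):
            removed.append((name, value))
        del msg[name]  # deletes every occurrence; no error if absent
    return msg.as_string(), removed

# Hypothetical message used only to exercise the filter.
SAMPLE = (
    "From: someone@example.net\n"
    "To: list@example.org\n"
    "Return-Receipt-To: collector@example.net\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)

cleaned, removed = strip_receipt_requests(SAMPLE)
print(removed)
```

The removed list can be logged so that the operator can see which senders are requesting receipts from the whole list.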

2. From web pages. Spammers have programs which spider through web pages, looking for email addresses, e.g. email addresses contained in mailto: HTML links (those you can click on to open a mail window). Some spammers even target their mail based on web pages. I discovered that a web page of mine had appeared in Yahoo when a spammer, who harvested email addresses from each new page appearing in Yahoo, sent me spam regarding that web page. A widely used countermeasure is the 'poison' CGI script. The script creates a page with several bogus email addresses and a link to itself. Spammers' software visiting the page would harvest the bogus email addresses and follow the link, entering an infinite loop that pollutes their lists with bogus email addresses.
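The 'poison' page idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than any particular published script: the function names, the /cgi-bin/poison.py URL, and the page layout are all hypothetical. The bogus addresses use the reserved .invalid top-level domain so that they can never belong to a real person.

```python
import random
import string

def bogus_address(rng):
    """Build one made-up address under the reserved .invalid domain."""
    user = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    host = "".join(rng.choice(string.ascii_lowercase) for _ in range(6))
    return f"{user}@{host}.invalid"

def poison_page(script_url, count=5, seed=None):
    """Return HTML with `count` bogus mailto: links and a link back to the script."""
    rng = random.Random(seed)
    addresses = [bogus_address(rng) for _ in range(count)]
    links = "\n".join(f'<a href="mailto:{a}">{a}</a><br>' for a in addresses)
    # A varying query string makes each visit look like a new page to a crawler.
    next_url = f"{script_url}?page={rng.randrange(10**9)}"
    return (
        "<html><body>\n"
        f"{links}\n"
        f'<a href="{next_url}">More contacts</a>\n'
        "</body></html>\n"
    )

# Hypothetical script location; a CGI wrapper would print
# "Content-Type: text/html" and a blank line before this page.
page = poison_page("/cgi-bin/poison.py", count=3, seed=1)
print(page)
```

Each request returns a different set of addresses plus a link back to the same script, which is what keeps a naive crawler looping.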

3. From various web and paper forms. Some sites request various details via forms, e.g. guest books and registration forms. Spammers can get email addresses from those either because the form becomes available on the World Wide Web, or because the site sells / gives the email list to others. Some companies sell / give away email lists collected on paper forms, e.g. organizers of conventions would make a list of participants' email addresses and sell it when it's no longer needed. Some spammers actually type in email addresses from printed material, e.g. professional directories and conference proceedings. Domain name registration forms are a favorite as well: the addresses are usually correct and up to date, and people read the emails sent to them expecting important messages.

4. From a web browser. Some sites use various tricks to extract a surfer's email address from the web browser, sometimes without the surfer noticing it. Those techniques include:

   1. Making the browser fetch one of the page's images through an anonymous FTP connection to the site. Some browsers would give the email address the user has configured into the browser as the password for the anonymous FTP account. A surfer not aware of this technique will not notice that the email address has leaked.

   2. Using JavaScript to make the browser send an email to a chosen email address, sent from the email address configured into the browser. Some browsers would allow email to be sent when the mouse passes over some part of a page. Unless the browser is properly configured, no warning will be issued.

   3. Using the From header that browsers send to the server (visible to CGI scripts as HTTP_FROM). Some browsers pass a header with your email address to every web server you visit.

It's worth noting here that when one reads email with a browser (or any mail reader that understands HTML), the reader should be aware of active content (Java applets, JavaScript, VB, etc.) as well as web bugs. An email containing HTML may contain a script that, upon being read (or even when the subject is highlighted), automatically sends email to any email address. A well-known example of mail that sends itself onward automatically is the Melissa virus, although it was carried as a Word macro in an attachment rather than as a script in HTML.
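As a rough illustration of what "being aware of active content" could mean in practice, the Python sketch below scans an HTML message body for script-like tags, event-handler attributes such as onmouseover, and images fetched from remote servers (including ftp:// URLs, which relate to the anonymous FTP trick). The class name, the tag list, and the sample HTML are assumptions chosen for the example, and the checks are heuristics, not a complete filter.

```python
from html.parser import HTMLParser

class ActiveContentScanner(HTMLParser):
    """Collect signs of active content and remote images in an HTML body."""

    # Assumed list of tags that can carry executable content.
    ACTIVE_TAGS = {"script", "applet", "object", "embed"}
    REMOTE_SCHEMES = ("http://", "https://", "ftp://")

    def __init__(self):
        super().__init__()
        self.active_tags = []     # e.g. ["script"]
        self.event_handlers = []  # e.g. [("body", "onload")]
        self.remote_images = []   # image URLs fetched from a remote server

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag in self.ACTIVE_TAGS:
            self.active_tags.append(tag)
        for name, value in attrs:
            if name.startswith("on"):  # heuristic: onload, onmouseover, ...
                self.event_handlers.append((tag, name))
            if (tag == "img" and name == "src" and value
                    and value.lower().startswith(self.REMOTE_SCHEMES)):
                self.remote_images.append(value)

def scan_html(html_text):
    scanner = ActiveContentScanner()
    scanner.feed(html_text)
    scanner.close()
    return scanner

# Hypothetical message body used only to exercise the scanner.
SAMPLE = (
    '<html><body onload="go()"><p>Hi</p>'
    '<img src="http://tracker.example/p.gif?id=42" width="1" height="1">'
    '<img src="ftp://files.example/logo.gif">'
    '<script>go = function(){}</script></body></html>'
)

result = scan_html(SAMPLE)
print(result.active_tags, result.event_handlers, result.remote_images)
```

A mail reader or filter could use findings like these to decide whether to show a message as plain text instead of rendering it.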

5. From IRC and chat rooms. Some IRC clients will give a user's email address to anyone who cares to ask for it. Many spammers harvest email addresses from IRC, knowing that those are 'live' addresses, and send spam to them. This method is used alongside the annoying IRC bots that send messages interactively to IRC and chat rooms without attempting to recognize who is participating in the first place. This is another major source of email addresses for spammers, especially as this is one of the first public activities newbies join, making it easy for spammers to harvest 'fresh' addresses of people who might have very little experience dealing with spam. AOL chat rooms are the most popular of those: according to reports, there's a utility that can get the screen names of participants in AOL chat rooms. The utility is reported to be specialized for AOL for two main reasons: AOL makes the list of the actively participating users' screen names available, and AOL users are considered prime targets by spammers due to AOL's reputation as the ISP of choice for newbies.