
Soundex Coding System
Submitted by R. Kyger
A soundex code is a four-character representation based on the way a name sounds rather than the way it is spelled. In theory, a name indexed this way can be found no matter how it was spelled. The system was developed by Margaret K. Odell and Robert C. Russell (see U.S. Patents 1261167 [1918] and 1435663 [1922]).
The WPA used the soundex coding system in the 1930s to do a partial indexing on 3x5 cards of the 1880 (all households with a child age 10 or younger) and 1900 censuses and a nearly full indexing of the 1910 (not all states completed) and 1920 (some states completed) censuses. The soundex indexes of the 1880, 1900 and 1910 census records are available on microfilm at the National Archives (and its branches) and many libraries or other archives. These microfilms also can be purchased from the National Archives. The names are arranged on the soundex indexes by first letter, then numerically within that letter, then alphabetically by the first name of the head of household within each different soundex code.
There is usually a separate card for each individual within the household whose surname is different from that of the head of household.
Besides telling where the original record can be found, the microfilmed soundex cards give basic information about each person in the household, such as place of residence, age, sex, relationship to head of household, state born, state where parents were born, etc. However, not all of the information contained in the original census records is included.
Every soundex code consists of a letter and three numbers, such as B525. The letter is always the first letter of the surname. The numbers are assigned this way:
1 = b,p,f,v
2 = c,s,k,g,j,q,x,z
3 = d,t
4 = l
5 = m,n
6 = r
disregard - a,e,i,o,u,w,y,h
To figure out a surname's code, do this:
Take the surname: KYGER
Eliminate any a,e,i,o,u,w,y,h after the first letter: KGR
Write the first letter, as is, followed by the codes found in the table above: KGR = K260
No matter how long or short the surname is, the soundex code is always the first letter of the name followed by three numbers. If you have coded the first letter and three numbers but still have more letters in the name, ignore them. If you have run out of letters in the name before you have three numbers, then add zeroes to the code.
WASHINGTON = WSNGTN = W252 (ignore the ending TN)
KUHNE = KN = K500 (add zeroes to the end)
Prefixes: If you have a surname with a prefix like Van, Von, De, Di, or Le, code it with and without the prefix because it may be listed under either code. Van Hoesen could be coded as VanHoesen or as Hoesen. Mac and Mc are NOT considered prefixes.
Double letters: Any double letters side by side should be treated as one letter. For example LLOYD is coded as if it were spelled LOYD. GUTIERREZ is coded as if it were GUTIEREZ.
Side by side letters with the same value: You may have different letters side by side that have the same code value. For example PFISTER (P & F are both 1), JACKSON (CKS are all 2). These letters should be treated as one letter. PFISTER is coded as PSTR (P236) and JACKSON is coded as JCN (J250). Only letters that are next to each other in the original spelling are merged: the J and C in JACKSON both have the value 2 but are separated by a vowel, so both count.
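The coding rules above can be sketched in a few lines of Python. This is only an illustration, not the SOUNDEX program mentioned in this article. The function names soundex and codes_with_prefix are made up for the example. Two assumptions are labeled in the comments: h and w are assumed not to separate letters of equal value (so the S and the following letters in a name like ASHCRAFT would merge), and the prefix helper only recognizes a prefix written as a separate word.

```python
# Letter-to-number table from the text; letters not listed carry no value.
CODES = {}
for digit, letters in {"1": "bpfv", "2": "cskgjqxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        CODES[letter] = digit

# Prefixes named in the text. Mac and Mc are deliberately left out.
PREFIXES = ("van", "von", "de", "di", "le")


def soundex(surname):
    """Return the letter-plus-three-numbers code for a surname (a sketch)."""
    letters = [ch for ch in surname.lower() if ch.isalpha()]
    if not letters:
        return ""
    digits = []
    # The first letter's own value counts when merging (PFISTER -> P236).
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        # Assumption: h and w do not separate equal values; vowels and y do.
        if ch not in "hw":
            previous = code
    # Add zeroes, then keep only the first letter and three numbers.
    return (letters[0].upper() + "".join(digits) + "000")[:4]


def codes_with_prefix(surname):
    """Codes to look under when the name may carry a separate prefix.

    Assumption: the prefix is written as its own word, e.g. "Van Hoesen".
    """
    words = surname.split()
    codes = [soundex(surname)]
    if len(words) > 1 and words[0].lower() in PREFIXES:
        codes.append(soundex("".join(words[1:])))
    return codes


for name in ["Kyger", "Washington", "Kuhne", "Lloyd", "Pfister", "Jackson"]:
    print(name, soundex(name))
print("Van Hoesen", codes_with_prefix("Van Hoesen"))
```

Walking the name letter by letter, rather than deleting the disregarded letters first, is what lets the sketch tell JACKSON (J250) apart from a name where the equal-valued letters really do touch.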
Thus, variations in spellings or misspellings should produce the same code number:
SMITH = S530
SMITHE = S530
SMYTH = S530
SMYTHE = S530
Note, however, that some names which are pronounced essentially the same produce different codes. An example is the "tz" sound in German names, which is normally pronounced the same as "ce" or "se." Also, the German "B" is often pronounced as the English "P." Thus the German name Bentz could be spelled that way or as Benz, Bens, Bents, Bennss, Bense, Bants and Banz, or as Penz, Pentz, Pence, Pens, Pense, Pents, Penns, Penze, Pentze, etc.
Indeed, the name has been found in census record indexes under all of these, and more. Remember: Those making the index have as hard a time reading the handwriting of census takers as we do. They will sometimes mistake a script "z" for a "y" and record Penty instead of Pentz, or mistake a "c" for an "e" and record Penee, for example. Therefore, to make sure you don't miss finding your ancestor, you may have to look under a half dozen or more different soundex codes:
BENTZ (and equivalents) = B532
PENTZ (and equivalents) = P532
BENZ (and equivalents) = B520
PENZ (and equivalents) = P520
BENTY (and equivalents) = B530
PENTY (and equivalents) = P530
PENEE (and equivalents) = P500
Think through the possible variant spellings (and misspellings and misreadings) of the surname you are searching before concluding that it can't be found in the soundex listings. Use your imagination. No mistake is beyond possibility! For instance, the name Pence has been indexed as Peirce (the reader mistook the written letter "n" for an "i-r" combination) and vice versa. The program SOUNDEX will figure the soundex code for any name.
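To see how many different codes a list of variant spellings spreads across, the variants can be grouped by code. The sketch below is an illustration only and is not the SOUNDEX program mentioned in this article. The names soundex and group_by_code are made up for the example, and the compact soundex helper assumes that h and w do not separate letters of equal value while vowels and y do.

```python
from collections import defaultdict

# Letter values from the table in this article, built as {letter: number}.
VALUE = {ch: number
         for number, group in enumerate(
             ["bpfv", "cskgjqxz", "dt", "l", "mn", "r"], start=1)
         for ch in group}


def soundex(surname):
    """Compact sketch of the coding rules (hypothetical helper)."""
    letters = [ch for ch in surname.lower() if ch.isalpha()]
    if not letters:
        return ""
    out, previous = letters[0].upper(), VALUE.get(letters[0])
    for ch in letters[1:]:
        value = VALUE.get(ch)
        if value and value != previous:
            out += str(value)
        if ch not in "hw":  # assumption: h and w do not break up a run
            previous = value
    return (out + "000")[:4]


def group_by_code(spellings):
    """Map each soundex code to the spellings that fall under it."""
    groups = defaultdict(list)
    for name in spellings:
        groups[soundex(name)].append(name)
    return dict(sorted(groups.items()))


variants = ["Bentz", "Benz", "Bens", "Bents", "Pentz", "Pence",
            "Penz", "Penty", "Penee", "Peirce"]
for code, names in group_by_code(variants).items():
    print(code, ", ".join(names))
```

Each key in the result is one section of the soundex index to check, so a longer list of keys means more places to look.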
© 1999 - 2004 Some Graphics By Carol, All Rights Reserved.
Genealogy Forum.com is a production of Golden Gate Services, Inc. of Franklin, Massachusetts.
© 1998 - 2004 All Rights Reserved. George Ferguson, President.
The Genealogy Forum is a member of the Federation of Genealogical Societies and the National Genealogical Society.
Tree logo provided by MeadPond Designs and is the trademark of GenealogyForum.com.
If you have any questions or comments,
please contact GenealogyForum@aol.com