Introduction

English has some of the most inconsistent spelling. This makes it much more difficult, and therefore expensive, to teach. Which is unfortunate for the de facto international language. It makes communication more difficult for the entire world. And it's fixable. I suggest everyone pick more regular spellings of just a couple words that don't seem too strange to you, and use them. This way, very gradually, you can help fix the language. On this page I basically try to predict how English would naturally become more consistent on its own.

This idea upsets some people. "You can't just change English!" Living languages, languages that are in active use, change. Would you prefer those changes to be entirely accidental? Wouldn't it be better if some of those changes were carefully thought out?

Many constructed languages have been created in hopes of providing a somehow more useful alternative for international communication. The main problem is getting people to actually use them. English is the de facto international language, and takes a long time to learn because of how internally inconsistent it is. What if we created a language that was as easy to learn as possible, by being as internally consistent as possible, while also being as easy to read as possible for people who already know English? Make adoption easier, and encourage people to use pieces of it to make their English easier to understand, as they are comfortable. Possibly entirely, in places where people are particularly concerned about international accessibility, where things like Basic English, Special English, and Simplified English are used. It might also be useful for fiction which requires a futuristic language.
Intradukshan

Inglish havs sum uv thee mowst inkansistant speling. This mayks it much mor difikalt, and theerfor ikspensiv, too teech. Wich iz unforchanit for thee [de] [facto] internashanal layngwij. It mayks kumyoonikayshan mor difikalt for thee intaier wurld. And its [fixable]. Ai sagjest evreewun pik mor regyaler spelings uv just ay kupal wurds that downt seem too straynj too yoo, and yooz them. This way, veree grajooalee, yoo kan help fiks thee layngwij. On this payj Ai baysikalee trai too pridikt how Inglish woud nacheralee bakum mor kansistant on its own.

This aideea upsets sum pursans. "Yoo [can't] just chaynj Inglish!" Living layngwijs, layngwijs that ar in aktiv yooz, chaynj. Woud yoo prifur thowz chaynjs too bee intairlee aksidentl? [wouldn't] it bee beter if sum uv thowz chaynjs wur kerfalee thingkd owt?

Menee kanstruktd layngwijs hav bin kreeaytd in howps uv pravaiding ay sumhow mor yoosfal awlturnativ for internashanal kumyoonikayshan. Thee mayn prablam iz geting pursans too akchooalee yooz them. Inglish iz thee [de] [facto] internashanal layngwij, and tayks ay lawng taim too lurn beekawz uv how inturnalee inkansistant it iz. Whut if wee kreeaytd ay layngwij that wuz az eezee too lurn az pasabl, bai beeing az inturnalee kansistant az pasabl, whail awlsow beeing az eezee too reed az pasabl for pursans hoo awlredee now Inglish? Mayk adaptshin eezeeer, and inkurrij pursans too yooz peess uv it too mayk theer Inglish eezeeer too understand, az thay ar kumftabal. Pasablee intairlee, in playss wheer pursans ar patikyalerlee kansurnd abowt internashanal aksesabilatee, wheer things laik Baysik Inglish, Speshal Inglish, and Simplifayd Inglish ar yoozd. It mayd awlsow bee yoosfal for fikshan wich rikwairs ay fyoocheristik layngwij.
All of the changes I'm suggesting are changes which I believe could happen from natural regularization, given enough time. I'm not certain of any of these choices, and am interested in suggestions.
Verb (past tense), noun (plural), and spelling regularization happens naturally over time. For example, the plural of "cow" changed from "kine" to "cows", the past tense of "help" changed from "holpe" to "helped", and "plow" changed from "plough". There are many of these straightforward changes left to make.
A popular example of spelling inconsistency, at least 140 years old: if the 'gh' in 'enough' is pronounced 'f', the 'o' in 'women' makes the short 'i' sound, and the 'ti' in 'nation' is pronounced 'sh', then the word 'ghoti' is pronounced just like 'fish'. Those problems can be fixed with these changes:
| enough | enouf |
| women | wimen |
| nation | nashon |
Then "fish" is pronounced like "fish".
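A minimal sketch of applying a small, personally chosen set of respellings to a piece of text. The names `RESPELLINGS` and `respell_text` are hypothetical and only for illustration; this is not the program used to generate the dictionaries on this page.

```python
import re

# Illustrative word list taken from the table above; extend it as you like.
RESPELLINGS = {"enough": "enouf", "women": "wimen", "nation": "nashon"}

def respell_text(text, table=RESPELLINGS):
    """Replace whole words found in `table`, keeping a leading capital."""
    def swap(match):
        word = match.group(0)
        new = table.get(word.lower())
        if new is None:
            return word  # word not in the list: leave it alone
        return new.capitalize() if word[0].isupper() else new
    # Match runs of letters so that "national" is not touched by "nation".
    return re.sub(r"[A-Za-z]+", swap, text)

print(respell_text("Enough women in this nation like fish."))
```

Because only whole words are matched, a short list like this can be adopted gradually without affecting any other words.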
My thoughts, so far, are:
Similar things have been done. I think it's important to mention the reason for all the decisions involved, so they can be discussed, and improved on.
For concrete results, I would like to create a few web browser spell checking dictionaries for people to use:
Here are all the IPA diaphonemes, with example words on the first line, and possible graphemes on the following lines, with my preference listed first.
Vowels
æ cat, black / trap lad bad cat
a - don't think it's worth distinguishing from ɑ:
ae - if it is worth distinguishing from ɑ:
ɑ is US for ɑ:? https://en.wiktionary.org/wiki/not#Pronunciation
ɑ: arm, father / palm father
a, same as ə?
ɒ hot, rock / lot not wasp / cloth off loss cloth long dog chocolate
o
ɔ is US for ɔ:?
ɔ: call, / thought law caught all halt talk (not using "four")
aw cawll thawt law cawt awll hawlt tawk
au caull thaut lau caut aull hault tauk
ou coull thout lou cout oull hoult touk
ol colll tholt lol colt olll hollt tolk
al call thalt lal calt all halt talk
a call that la cat all halt tak
o coll thot lo cot oll holt tok
ə away, cinema / comma about
a, same as ɑ:? (ə is the schwa sound)
ɨ / kit spotted year
i kit spottid yir - new, not updated in dictionaries, was using "y"
y, same as ɪ,i kyt spottyd yyr - an old spelling of this sound, problematic spelling of "year" - "yyr"
e ket spotted yer
ɪ hit, sitting / sit english guitar (short i)
i, hit sitting sit inglish gitar - in use for aɪ (long i)
y, same as ɨ,i hyt syttyng syt ynglysh gytar
e, het setteng set englesh getar
i / happy city be bee
ee happee citee bee bee
i, happi citi bi bi
y, same as ɪ,ɨ happy city by by
e, happe cite be be
i: see, heat / fleece see meat
ee see heet fleece see meet
ea sea heat fleace sea meat
eɪ say, eight / face date day pain whey rein
ay say ayt fayce dayte day payn whay rayn
ai sai ait faice daite dai pain whai rain
ei sei eit feice deite dei pein whei rein
ey sey eyt feyce deyte dey peyn whey reyn
ɛ / dress bed met
e
ɜr / nurse burn herd earth bird
ur nurse burn hurd urth burd
er nerse bern herd erth berd
ear nearse bearn heard earth beard
ir nirse birn hird irth bird
ər / letter winner massacre
er letter winner massacer
ɚ weird
er
əɹ - another form of /ɚ/ according to wiktionary
er
ʌ cup, luck / strut run won flood
u cup luck strut run wun flud
ʊ put, could / foot hood book
ou pout coud fout houd bouk - used for u:
oul poult could foult hould boulk
oo poot cood foot hood book - used for u:
u may be same as u:
u: blue, food / goose through you threw yew
oo bloo food goose throo yoo throo yoo
ew blew fewd gewse threw yew threw yew
ue blue fued guese thrue yue thrue yue
aɪ five, eye / price my wise high flight mice I by like time (long i)
ai faive ai praice mai waise hai flait maice Ai bai laike taime - new, not updated in dictionaries - what does this do to words with "a" + "i" sounds? there are none!?
prais mai waiz hai flait mais Ai bai laik taim - full regularization examples with "ai"
i five i price mi wise hi flit mice I bi like time
y fyve y pryce my wyse hy flyt myce Y by lyke tyme - in use for ɨ,ɪ,i (short i), and consonant y
ii fiive ii priice mii wiise hii fliit miice Ii bii liike tiime
ay fayve ay prayce may wayse hay flayt mayce ay bay layke tayme
oy foyve oy proyce moy woyse hoy floyt moyce Oy boy loyke toyme
oi foive oi proice moi woise hoi floit moice Oi boi loike toime
oe foeve oe proece moe woese hoe floet moece Oe boe loeke toeme
oa foave oa proace moa woase hoa float moace Oa boa loake toame
ɔɪ boy, join / choice boy hoist
oi join choice boi hoist
oy joyn choyce boy hoyst
oʊ go, home / goat no toe soap tow folk soul roll cold
ow gow howme gowt now tow sowp toww fowk sowl rowll cowld
oe goe hoeme goet noe toe soep toew foek soel roell coeld
o go home got no to sop tow fok sol roll cold
oa goa hoame goat noa toa soap toaw foak soal roall coald
ol gol holme golt nol tol solp tolw folk soll rolll colld
ou gou houme gout nou tou soup touw fouk soul roull could
aʊ now, out / mouth now trout
ow owt mowth now trowt
ou out mouth nou trout
ɑr / start arm car
ar
ɪər / near deer here
ear near dear hear
ɛər / square mare there bear where air
air sqair mair thair bair
are sqare mare thare bare
ere sqere mere there bere
ear sqear mear thear bear
uare square muare thuare buare
ɔr / north sort warm
or north sort worm
ar narth sart warm
ɔɹ may be same as ɔər https://en.wiktionary.org/wiki/for#Pronunciation
ɔər / tore boar port
or tor bor port - same as ɔr
ʊər / cure tour moor
oar coar toar moar
our cour tour mour
ure cure ture mure
oor coor toor moor
eur ceur teur meur
jʊər / cure pure europe your
ure cure pure ureope ure
eur ceur peur europe eur
your cyour pyour yourope your
Consonants
p pet, map / pen spin tip
p
b bad, lab / but web
b
t tea, getting / two sting bet
t
ɾ alveolar flap
t
(d) may be same as d
d did, lady / do odd
d
t͡ʃ / tʃ check church / chair nature teach
ch check church chair nachure teach
c ceck curc cair nacure teac - not using it alone anywhere else?
d͡ʒ / dʒ just, large / gin joy edge language
j just larj jin joy ej languaj
k cat, back / cat kill skin queen unique thick
k kat bak kat kill skin kueen unik thik
g give, flag / go get beg
g
f find, if / fool enough leaf off photo
f find if fool enouf leaf of foto
v voice, five / voice have of
v voice five voice hav ov
θ think, both / thing teeth
th same as ð
ð this, mother / this breathe father
th same as θ
s sun, miss / see city pass mice
s sun mis see sity pas mis
z zoo, lazy / zoo rose
z zoo lazy zoo roze
ʃ she, crash / she sure session emotion leash
sh she crash she shure seshon emoshon leash
ʒ pleasure, vision / pleasure beige equation seizure
zu pleazure beizu equazuon seizure
su pleasure beisu equasuon seisure
ge pleagere beige equageon seigere
ti pleatire beiti equation seitire
x / loch (Scottish) ugh
gh
h how, hello
h
m man, lemon / man ham
m
n no, ten / no tin
n
n̩ hidden
in hiddin
ŋ sing, finger / ringer sing finger drink
ng sing finger ringer sing finger dringk
ŋg
ng - this eliminates /ŋg/ -> "ngg"
l leg, little / left bell
l leg little left bel
r red, try / run very
r
w wet, window / we queen
w wet window we qween
j yes, yellow, year
y yes, yellow, year
j jes, jellow, jear - would probably be better, y always a vowel, but looks weird, and used for dʒ
x xes, xellow, xear - long term plan?
hw what
wh
I was previously using "y" for the short "i" sound, and "i" for the long "i" sound, resulting in "I" -> "I", and "English" -> "Ynglysh". That also left 'y' serving as both a consonant and a vowel, causing "year" to become "yyr". I changed the short 'y' to 'i', and the long 'i' to 'ai', based on its IPA /aɪ/, resulting in "I" -> "Ai", and "English" -> "Inglish", and "year" -> "yir".
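The phoneme-to-grapheme choices above can be sketched as a lookup table plus a greedy longest-match scan over an IPA string. This is only a sketch under assumptions: `GRAPHEMES` holds a small subset of the first-listed graphemes, the name `respell` is hypothetical, and any symbol missing from the table is passed through unchanged.

```python
# Subset of the first-listed (preferred) graphemes from the tables above.
# Two-symbol entries are tried before single symbols.
GRAPHEMES = {
    "aɪ": "ai", "eɪ": "ay", "oʊ": "ow", "aʊ": "ow", "ɔɪ": "oi",
    "ɑr": "ar", "ɜr": "ur", "ər": "er",
    "tʃ": "ch", "dʒ": "j", "ŋɡ": "ng",
    "æ": "a", "ɑ": "a", "ə": "a", "ɒ": "o", "ɛ": "e", "ɪ": "i", "ʌ": "u",
    "θ": "th", "ð": "th", "ʃ": "sh", "ŋ": "ng", "j": "y", "ɡ": "g",
}

def respell(ipa):
    """Greedy longest-match conversion of an IPA string to graphemes."""
    # Drop slashes, stress marks and tie bars; normalise ASCII g to IPA ɡ.
    for mark in "/ˈˌ\u0361":
        ipa = ipa.replace(mark, "")
    ipa = ipa.replace("g", "ɡ")
    out, i = [], 0
    while i < len(ipa):
        for size in (2, 1):
            chunk = ipa[i:i + size]
            if chunk in GRAPHEMES:
                out.append(GRAPHEMES[chunk])
                i += len(chunk)
                break
        else:
            out.append(ipa[i])  # p, b, t, d, k, ... are written as themselves
            i += 1
    return "".join(out)

for ipa in ["/ˈɪŋɡlɪʃ/", "/aɪ/", "/θɪŋk/", "/wɪtʃ/"]:
    print(ipa, "->", respell(ipa))
```

With this subset, /ˈɪŋɡlɪʃ/ comes out as "inglish" and /aɪ/ as "ai" (capitalization is not handled here). Swapping a grapheme choice only means editing one entry in the table.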
I should look at Lojikl Inglish again.
Dictionary built from wiktionary.org.
Most common 100 words.
This is the one I'm currently working on.
Phonetic spelling, and regularization of nouns and verbs. 94,327 words. Columns are:
I've made a first pass at handling regularization of nouns and verbs. It wasn't as bad as I expected. I think I have some tweaking of rules to do (when to add "d" vs "id" for past tense verbs, etc.). The IPA column shows the IPA for the base word, not all the noun / verb forms.
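One candidate answer to the "d" vs "id" question, sketched on already-respelled base words. The rule shown (use "id" only after a final "t" or "d") is an assumption for illustration, not necessarily what the dictionary currently does; the function names are hypothetical.

```python
def past_tense(base):
    """Regular past tense: "id" after a final t or d, otherwise "d"."""
    return base + ("id" if base.endswith(("t", "d")) else "d")

def plural(base):
    """Regular plural: always add "s"."""
    return base + "s"

print(past_tense("help"), past_tense("kreeayt"), plural("chaynj"))
```

Under this rule "kreeayt" would become "kreeaytid" rather than "kreeaytd", which keeps the extra syllable visible in the spelling.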
The entire contents of wiktionary.org are available for download. It's not the most convenient to parse.
Problems I've found parsing wiktionary.
Parsing wiktionary is inconvenient, dealing with stressed vs. unstressed, and senses. But I've made good progress on it.
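A rough sketch of pulling phonemic transcriptions out of an entry's wikitext. It assumes pronunciations appear as `{{IPA|en|/.../}}` templates; the template layout has varied over time, so treat the pattern as an assumption to check against the dump. The `sample` string is made up for illustration, and `extract_ipa` is a hypothetical name.

```python
import re

# Matches the argument list of an English IPA template, e.g. {{IPA|en|/x/|[y]}}
IPA_TEMPLATE = re.compile(r"\{\{IPA\|en\|([^}]*)\}\}")

def extract_ipa(wikitext):
    """Return every /phonemic/ transcription found in English IPA templates."""
    found = []
    for args in IPA_TEMPLATE.findall(wikitext):
        for arg in args.split("|"):
            arg = arg.strip()
            # Keep /phonemic/ forms, skip [phonetic] ones and named parameters.
            if arg.startswith("/") and arg.endswith("/") and len(arg) > 2:
                found.append(arg)
    return found

sample = "* {{a|US}} {{IPA|en|/ˈwɔtɚ/|[ˈwɑɾɚ]}}\n* {{audio|en|en-us-water.ogg|Audio (US)}}"
print(extract_ipa(sample))
```

This ignores accent labels and senses entirely, which is exactly where most of the inconvenience mentioned above comes from.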
To do:
Dictionary built using FEWL. It handles noun plurals, and verb past tenses. I stopped using it because wiktionary has much more data on irregular nouns and verbs, and more words. Columns are:
I'm pretending "data", "those", and "these" are not plural. I also haven't handled be/am/is/are,was/were,been. Ew. Perhaps one (set of) exception(s) is reasonable. Be / beed?
"Myself" gets a hyphen because the author of FEWL doesn't indicate the difference between words that were originally hyphenated, and compounded words.
Words missing from FEWL: states caused bothered converting sounds definitions letters handled emailed noticed apostrophes handling contractions
FEWL is useful because it provides pronunciation, noun plural, and verb past tense info. But I could get the same stuff by downloading wiktionary. Which would probably give me more words.
Old dictionary built using cmudict, lacking handling of verb past tenses and noun plurals.
Be/am/is/are/was/were/been may be the greatest example of how much of a mess English is. Seven distinct spellings for what is effectively a single word. This is exactly why English takes so much longer to learn to read than other languages. It's also not surprising that this word in particular is so problematic, because research has shown that more commonly used words take longer to naturally become regularized. This also provides me with a great example to use in detailing how I encourage you to use as much or as little of my spelling methods as you're comfortable with. This is going to look weird. Keep in mind that just as it now seems odd that the past tense of "help" was once "holpe" before it was naturally regularized into "helped", if English remains in use long enough, people will one day wonder what madness caused us to use seven apparently random different spellings of the same word. Also note that I'm not changing the pronunciation of the word "be", just its spelling.
| Leave them alone | will be | I am | it is | you are | she was | they were | it has been |
| Consistent spelling | will by | I am | it yz | you ar | she wuz | they wur | it has byn |
| Reduce to three words | will by | I by | it by | you by | she wuz | they wuz | it has byn |
| Full regularization | will by | I by | it by | you by | she byd | they byd | it has byd |
This causes problems with the contraction "I'm".
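The rows of the table can be written as lookup tables, one per level of regularization, so a reader can pick how far to go. This is a sketch only; `BE_LEVELS` and `respell_be` are hypothetical names, and the spellings are copied from the table as shown.

```python
# Rows of the table above, keyed by the standard form of "be".
BE_LEVELS = {
    "consistent": {"be": "by", "am": "am", "is": "yz", "are": "ar",
                   "was": "wuz", "were": "wur", "been": "byn"},
    "three":      {"be": "by", "am": "by", "is": "by", "are": "by",
                   "was": "wuz", "were": "wuz", "been": "byn"},
    "full":       {"be": "by", "am": "by", "is": "by", "are": "by",
                   "was": "byd", "were": "byd", "been": "byd"},
}

def respell_be(sentence, level):
    """Replace forms of "be" in a space-separated sentence."""
    forms = BE_LEVELS[level]
    return " ".join(forms.get(word, word) for word in sentence.split())

print(respell_be("they were", "full"))
print(respell_be("I'm", "full"))  # contractions are not in the table
```

The second call returns "I'm" unchanged, which shows the contraction problem: a word-for-word table has nothing to say about it.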
The language toki pona doesn't include these words at all.
Looking up nite made me realize I should scrape wiktionary for more common phonetic spellings of words.
One of the disadvantages of phonetic spelling is that you lose etymology info. I believe improved ease of international communication is far more valuable. But I think this video is an excellent example of the disadvantages of keeping the spelling that provides etymological clues. It is a French word. When it was adopted by English, the French spelling was kept. As a result, it has been pronounced with English phonetics as long as these things have been written down. It is pronounced "wrong" because when it was adopted, phonetic spelling was not used.
I haven't handled cases where multiple words are spelled the same but pronounced differently (heteronym homographs), like "read". I'd like to spell them differently.
More extreme possibilities I can think of are a couple cases where this language could be made more like Lojban, while remaining mutually intelligible with English: 1) Lojban doesn't have verbs and nouns, it has primitives which are converted to verbs and nouns by adding suffixes. Perhaps this way we could eliminate some redundant words. 2) Lojban grammar / sentence structure? ("First they came for the verbs, and I said nothing because verbing weirds language. Then they arrival for the nouns, and I speech nothing because I no verbs.")
What else would make useful improvements? I think I've read that poets would love if "you" and "I"/"me" rhymed.
I should try typing with diacritics, maybe it's not as bad as I think, and they'd improve the number of phonemes available without using multiple letters.
I would strongly support adding new words from other languages, by mapping them to the provided phonemes.
I'd like to create a web application that makes it easy for anyone to play with selecting their own graphemes for all the phonemes, interactively displaying the resulting words' spellings.
It would probably be more important to create a web app where you can paste the IPA spelling of a word, and get it converted.
"Data", "those", "these" are plural?
Might be good to do something with she/her, he/him. I've heard young kids get these confused, and maybe they're redundant? (Always using "her", never "she", as in "Her was [doing something].") These are subject and object pronouns? Are there other similarly redundant words? What other words do small children and adults learning English tend to get "wrong"? "Us"/"we"? Subject pronouns: I, we, you, he, she, it, or they. Object pronouns: me, us, you, him, her, it, and them. Old Norse "hann" means "him" or "he", so apparently this happens elsewhere.
Make movie with this language for promotion, used similar to A Clockwork Orange, where they wanted a futuristic language?
In an ideal situation, every sound would correspond to a single letter. There are over 40 sounds in English, and 26 letters. Often people interested in this goal add more letters. I'm curious about taking away sounds, to have only 26 of them. Somebody must have tried this, but I have not yet found an example. On 2014-11-17, Allan Kiisk said that in the Estonian spelling reform, pronunciations adjusted to match new phonetic spellings.
There was a successful Turkish spelling reform. (Apparently made pronunciation more regular, but spelling less predictable.)
These are ideas from people, not output from my program:
| Common | Suggested |
|---|---|
| burnt | burned |
| cemetery | cemetary |
| stationery | stationary |
| maintenance | maintenence |
| weird | wierd |
| judgment | judgement |
| guarantee | gaurentee |
| guard | gaurd |
| guardian | gaurdian |
| colonel | kernel |
| temperament | temperment |
| favour | favor |
| paid | payed |
| dreamt | dreamed |
| night | nite |
| learnt | learned |
| through | thru |
| camouflage | camoflage |
These are output from my program:
| Common | Regularized |
|---|---|
| weather | wether |
| whether | wether |
| guitar | gitar |
| weird | wierd |
| been | bin |
| guarantee | garantee |
| guardian | gardian |
| which | wich |
| paid | payd |
| think | thingk |
| prefer | prifur |
| have | hav |
Creating a spell check dictionary add-on for firefox.
Bug preventing me from using it, 2018-09-13.
Firefox English spelling dictionary.
Untested because of above bug, Firefox spelling dictionary, with the following changes:
Based on English United States Dictionary version 60 from Jul 23, 2018.
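Firefox spelling dictionaries use Hunspell-format files: a `.dic` word list whose first line is a word count, plus an `.aff` file. A minimal sketch of producing the word list from respelled words follows. `build_dic` and `MINIMAL_AFF` are hypothetical names, and packaging the files into an add-on is not shown.

```python
def build_dic(words):
    """Return the text of a Hunspell-style .dic file for the given words."""
    unique = sorted(set(words))
    # First line is the number of entries, then one word per line.
    return "\n".join([str(len(unique))] + unique) + "\n"

# Smallest useful affix file: it only declares the encoding.
MINIMAL_AFF = "SET UTF-8\n"

print(build_dic(["wich", "hav", "wich", "thingk"]))
```

Starting from an existing dictionary and only swapping the changed words would keep everything else in the original word list intact.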
Not long ago, in a video, I heard the word "impregnable". And, even though I knew what all of the parts meant, I didn't know what the word meant. Did the "im" prefix mean "into", as in "impregnate" and "immigration", or did it mean "not", as in "impenetrable" and "impossible"? My guess was wrong, and this is a common problem in English.
So another potential improvement could be to replace inconsistent prefixes. Like replacing all instances of the prefix "im" meaning "not" with the prefix "un". Giving us "unpregnable", "unpenetrable", and "unpossible".
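A sketch of that prefix replacement. Spelling alone cannot tell the "not" sense of "im" from the "into" sense, so this assumes a hand-made list of negating words; `NEGATING_IM` and `regularize_prefix` are hypothetical names.

```python
# Hand-picked examples where "im" means "not". Words such as "impregnate"
# or "immigration" are deliberately absent.
NEGATING_IM = {"impregnable", "impenetrable", "impossible"}

def regularize_prefix(word):
    """Replace a negating "im" prefix with "un"; leave other words alone."""
    if word in NEGATING_IM:
        return "un" + word[2:]
    return word

for w in ["impossible", "impregnable", "impregnate"]:
    print(w, "->", regularize_prefix(w))
```

A real list would have to come from a source that records prefix meaning, for example etymology sections, rather than from pattern matching on the spelling.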